Code Runs on Linux but not macOS High Sierra

Hello,

I am trying to run iterate_copy.C, which is supposed to loop over all files in a directory named Data, and run the program testprog.C on these files.

When I try to run this on macOS High Sierra (10.13.3), gcc version 4.2.1, ROOT version 6.12/06, I get the following error:

file name: reco_November2017T90PHscan_PH0_.root

*** Break *** segmentation violation
[/usr/lib/system/libsystem_platform.dylib] _sigtramp (no debug info)
[] (no debug info)
[] (no debug info)
[] (no debug info)
[/Users/djaroslawski/Documents/root/lib/libCling.so] cling::IncrementalExecutor::executeWrapper(llvm::StringRef, cling::Value*) (no debug info)
[/Users/djaroslawski/Documents/root/lib/libCling.so] cling::Interpreter::RunFunction(clang::FunctionDecl const*, cling::Value*) (no debug info)
[/Users/djaroslawski/Documents/root/lib/libCling.so] cling::Interpreter::EvaluateInternal(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, cling::CompilationOptions, cling::Value*, cling::Transaction**, unsigned long) (no debug info)
[/Users/djaroslawski/Documents/root/lib/libCling.so] cling::Interpreter::process(std::__1::basic_string<char, std::__1::char_traits, std::__1::allocator > const&, cling::Value*, cling::Transaction**, bool) (no debug info)
[/Users/djaroslawski/Documents/root/lib/libCling.so] cling::MetaProcessor::process(llvm::StringRef, cling::Interpreter::CompilationResult&, cling::Value*, bool) (no debug info)
[/Users/djaroslawski/Documents/root/lib/libCling.so] HandleInterpreterException(cling::MetaProcessor*, char const*, cling::Interpreter::CompilationResult&, cling::Value*) (no debug info)
[/Users/djaroslawski/Documents/root/lib/libCling.so] TCling::ProcessLine(char const*, TInterpreter::EErrorCode*) (no debug info)
[/Users/djaroslawski/Documents/root/lib/libRint.so] TRint::ProcessLineNr(char const*, char const*, int*) (no debug info)
[/Users/djaroslawski/Documents/root/lib/libRint.so] TRint::HandleTermInput() (no debug info)
[/Users/djaroslawski/Documents/root/lib/libCore.so] TUnixSystem::CheckDescriptors() (no debug info)
[/Users/djaroslawski/Documents/root/lib/libCore.so] TMacOSXSystem::DispatchOneEvent(bool) (no debug info)
[/Users/djaroslawski/Documents/root/lib/libCore.so] TSystem::InnerLoop() (no debug info)
[/Users/djaroslawski/Documents/root/lib/libCore.so] TSystem::Run() (no debug info)
[/Users/djaroslawski/Documents/root/lib/libCore.so] TApplication::Run(bool) (no debug info)
[/Users/djaroslawski/Documents/root/lib/libRint.so] TRint::Run(bool) (no debug info)
[/Users/djaroslawski/Documents/root/bin/root.exe] main (no debug info)
[/usr/lib/system/libdyld.dylib] start (no debug info)
Root >

However, the exact same program works as intended on our Linux server, which has these versions:

gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18), ROOT 6.08/07

Can anyone explain this difference, please?

Thank you so much for your time and effort.

Attached is the google drive link with the relevant files in a tarball.

https://drive.google.com/file/d/1YNfTO6KvtQFJ5QvYrPFwAckLd20aOfuS

Hi,

this is not expected. Could you share a recipe, step by step, to reproduce the problem from your code?
For example now some input files are missing. Perhaps the example can be reduced?

Cheers,
D

You can also verify that the program does not work by ‘accident’ by running it under valgrind on the server ( valgrind --suppressions=$ROOTSYS/etc/valgrind-root.supp root.exe -b -l -q 'iterate_copy.C' )

Cheers,
Philippe.

Hello,

To execute the program I simply type:

root -l iterate.C

This works on the server’s Linux but not on my macOS machine. I have tried using fewer input files, but that does not help. (My local file is just named iterate.C, without ‘copy’, but is otherwise identical.)

Thank you for your reply!

Hello,

I tried that and it returned this:

valgrind --suppressions=$ROOTSYS/etc/valgrind-root.supp root.exe -b -l -q 'iterate.C'
==761== Memcheck, a memory error detector
==761== Copyright (C) 2002-2015, and GNU GPL’d, by Julian Seward et al.
==761== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==761== Command: root.exe -b -l -q iterate.C
==761==
==761== Warning: set address range perms: large range [0x818b000, 0x18315000) (defined)
==761== Warning: set address range perms: large range [0x818b000, 0x18315000) (noaccess)
==761== Warning: set address range perms: large range [0x818b000, 0x18315000) (defined)
==761== Warning: set address range perms: large range [0x818b000, 0x18315000) (noaccess)
==761== Warning: set address range perms: large range [0x818b000, 0x18315000) (defined)
root [0]
Processing iterate.C…
file name: reco_November2017AngleScan_H0_V.root
==761== Use of uninitialised value of size 8
==761== at 0x4DA4EC9: testprog::Loop() (in /users/h2/daj111/mapsahex/testprog_C.so)
==761== by 0x4DAC2C0: ???
==761== by 0x4DAA065: ???
==761== by 0x5C9DFFE: cling::Interpreter::RunFunction(clang::FunctionDecl const*, cling::Value*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5C9F05C: cling::Interpreter::EvaluateInternal(std::string const&, cling::CompilationOptions, cling::Value*, cling::Transaction**, unsigned long) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5C9F33A: cling::Interpreter::echo(std::string const&, cling::Value*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5D2BE4A: cling::MetaSema::actOnxCommand(llvm::StringRef, llvm::StringRef, cling::Value*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5D36D40: cling::MetaParser::isXCommand(cling::MetaSema::ActionResult&, cling::Value*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5D37D0D: cling::MetaParser::isCommand(cling::MetaSema::ActionResult&, cling::Value*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5D25B02: cling::MetaProcessor::process(char const*, cling::Interpreter::CompilationResult&, cling::Value*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5C09165: HandleInterpreterException(cling::MetaProcessor*, char const*, cling::Interpreter::CompilationResult&, cling::Value*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5C19C73: TCling::ProcessLine(char const*, TInterpreter::EErrorCode*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761==
==761== Invalid write of size 4
==761== at 0x4DA4EC9: testprog::Loop() (in /users/h2/daj111/mapsahex/testprog_C.so)
==761== by 0x4DAC2C0: ???
==761== by 0x4DAA065: ???
==761== by 0x5C9DFFE: cling::Interpreter::RunFunction(clang::FunctionDecl const*, cling::Value*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5C9F05C: cling::Interpreter::EvaluateInternal(std::string const&, cling::CompilationOptions, cling::Value*, cling::Transaction**, unsigned long) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5C9F33A: cling::Interpreter::echo(std::string const&, cling::Value*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5D2BE4A: cling::MetaSema::actOnxCommand(llvm::StringRef, llvm::StringRef, cling::Value*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5D36D40: cling::MetaParser::isXCommand(cling::MetaSema::ActionResult&, cling::Value*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5D37D0D: cling::MetaParser::isCommand(cling::MetaSema::ActionResult&, cling::Value*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5D25B02: cling::MetaProcessor::process(char const*, cling::Interpreter::CompilationResult&, cling::Value*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5C09165: HandleInterpreterException(cling::MetaProcessor*, char const*, cling::Interpreter::CompilationResult&, cling::Value*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== by 0x5C19C73: TCling::ProcessLine(char const*, TInterpreter::EErrorCode*) (in /cvmfs/cms.cern.ch/slc6_amd64_gcc530/lcg/root/6.08.07/lib/libCling.so)
==761== Address 0xdfefef180 is not stack’d, malloc’d or (recently) free’d
==761==

*** Break *** segmentation violation
#0 0x000000003814081c in ?? ()
#1 0x00000000380dbbfb in ?? ()
#2 0x00000000380d85cb in ?? ()
#3 0x00000000380d9c7f in ?? ()
#4 0x00000000380e9297 in ?? ()
#5 0x0000000000000000 in ?? ()
Root > ==761==
==761== HEAP SUMMARY:
==761== in use at exit: 77,655,948 bytes in 163,371 blocks
==761== total heap usage: 502,148 allocs, 338,777 frees, 393,865,050 bytes allocated
==761==
==761== LEAK SUMMARY:
==761== definitely lost: 11,381 bytes in 30 blocks
==761== indirectly lost: 9,953 bytes in 112 blocks
==761== possibly lost: 414,720 bytes in 3,810 blocks
==761== still reachable: 76,913,035 bytes in 155,976 blocks
==761== of which reachable via heuristic:
==761== stdstring : 156,587 bytes in 1,168 blocks
==761== newarray : 20,512 bytes in 34 blocks
==761== multipleinheritance: 4,848 bytes in 8 blocks
==761== suppressed: 306,859 bytes in 3,443 blocks
==761== Rerun with --leak-check=full to see details of leaked memory
==761==
==761== For counts of detected and suppressed errors, rerun with: -v
==761== Use --track-origins=yes to see where uninitialised values come from
==761== ERROR SUMMARY: 14 errors from 2 contexts (suppressed: 2718 from 161)

I’m not sure what to make of it. It suggests that there is a memory leak somewhere? That is weird, because it works fine and produces the proper output files.

Can you please help me decipher this?

Thank you for your reply,

David

Try something like (note: some Valgrind functionality requires that the source code, that you want to analyse, is compiled with debug symbols):

valgrind --tool=memcheck --leak-check=full [--show-reachable=yes] [--track-origins=yes] [--num-callers=50] [--vgdb=full] --suppressions=`root-config --etcdir`/valgrind-root.supp `root-config --bindir`/root.exe -l -q 'iterate.C++g'
valgrind --tool=exp-sgcheck [--num-callers=50] [--vgdb=full] --suppressions=`root-config --etcdir`/valgrind-root.supp `root-config --bindir`/root.exe -l -q 'iterate.C++g'

and especially carefully study messages that appear in the beginning of the output.
(Note: the --show-reachable=yes option will give you too many warnings, I believe.)

BTW. The above Valgrind commands include some [something_optional] options (if you want to use any of these options, REMOVE the “square brackets”).


Hello,

Running the suggested commands yields the same error message as I posted above, in addition to VERY many messages of the type:

Warning in TClassTable::Add: class TObject already in TClassTable

but for pretty much every class.

This is behaving very weirdly; sorry, but I am very new to this.

Thank you for your reply,

David

{
  // name this ROOT macro file "RunMe.cxx" and then debug everything using:
  // valgrind --tool=memcheck --leak-check=full --suppressions=`root-config --etcdir`/valgrind-root.supp `root-config --bindir`/root.exe -b -n -q -l RunMe.cxx
  // valgrind --tool=exp-sgcheck --suppressions=`root-config --etcdir`/valgrind-root.supp `root-config --bindir`/root.exe -b -n -q -l RunMe.cxx
  // note: study valgrind's output messages from running the "iterate"
  gROOT->LoadMacro("testprog.C++g");
  std::cout << " ... starting iterate() ..." << std::endl;
  gROOT->Macro("iterate.C");
  std::cout << " ... finished iterate() ..." << std::endl;
}

Hi,

For this type of study the argument --leak-check=full should be avoided, as it leads to output that is irrelevant to the problem.

Warning in TClassTable::Add: class TObject already in TClassTable

This indicates that the same library is loaded multiple times. In your case that is likely fatal, and it likely means that more than one version of ROOT is set up/available through an environment variable (like LD_LIBRARY_PATH).

==761== Invalid write of size 4
==761== at 0x4DA4EC9: testprog::Loop() (in /users/h2/daj111/mapsahex/testprog_C.so)

This clearly indicates that there is a problem in testprog.C, but because the debug symbols are missing we do not yet have enough information. As Wile said, try:

gROOT->LoadMacro("testprog.C++g");

To execute the program I simply type:
root -l iterate.C

I think we do not have access to the input file and thus can not run.

Cheers,
Philippe.

Hi,

Here is a drive link to a tar file of the Data/ and Data_processed/ folders that testprog.C will take inputs from and output to.

https://drive.google.com/open?id=14H-tTuCvcmpUZ1ZqxU8YRDB2OOXjMeFE

Was not working for me either. Maybe with the input files, you guys can help me better diagnose the issue.

Thank you so much for your replies.

David

Hi David,

Note the following relevant warnings, which mean that even if/when it works the results are suspicious.

Info in <TMacOSXSystem::ACLiC>: creating shared library /Users/pcanal/Downloads/tarball1/./testprog_C.so
/Users/pcanal/Downloads/tarball1/./testprog.C:105:17: warning: variable 'cx' may be uninitialized when used here [-Wconditional-uninitialized]
                cx += pixmap_x[impa][ip];
                ^~
/Users/pcanal/Downloads/tarball1/./testprog.C:75:19: note: initialize the variable 'cx' to silence this warning
          float cx, cy, cz, ct, ctt;
                  ^
                   = 0.0
/Users/pcanal/Downloads/tarball1/./testprog.C:104:17: warning: variable 'cnpix' may be uninitialized when used here [-Wconditional-uninitialized]
                cnpix++;
                ^~~~~
/Users/pcanal/Downloads/tarball1/./testprog.C:76:20: note: initialize the variable 'cnpix' to silence this warning
          int cnpix, cnmpa, cpixnum;
                   ^
                    = 0
/Users/pcanal/Downloads/tarball1/./testprog.C:106:17: warning: variable 'cy' may be uninitialized when used here [-Wconditional-uninitialized]
                cy += pixmap_y[impa][ip];
                ^~
/Users/pcanal/Downloads/tarball1/./testprog.C:75:23: note: initialize the variable 'cy' to silence this warning
          float cx, cy, cz, ct, ctt;
                      ^
                       = 0.0
/Users/pcanal/Downloads/tarball1/./testprog.C:107:45: warning: variable 'ct' may be uninitialized when used here [-Wconditional-uninitialized]
                if (pixmap_time[impa][ip] < ct) ct = pixmap_time[impa][ip];
                                            ^~
/Users/pcanal/Downloads/tarball1/./testprog.C:75:31: note: initialize the variable 'ct' to silence this warning
          float cx, cy, cz, ct, ctt;
                              ^
                               = 0.0
/Users/pcanal/Downloads/tarball1/./testprog.C:108:48: warning: variable 'ctt' may be uninitialized when used here [-Wconditional-uninitialized]
                if (pixmap_Teltime[impa][ip] < ctt) ctt = pixmap_Teltime[impa][ip];
                                               ^~~
/Users/pcanal/Downloads/tarball1/./testprog.C:75:36: note: initialize the variable 'ctt' to silence this warning
          float cx, cy, cz, ct, ctt;
                                   ^
                                    = 0.0

Valgrind is only complaining about the uninitialized variable and the crash on MacOS is in a very strange place. I am investigating.
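
For illustration only, here is a minimal standalone sketch of how accumulators like those in the warnings could be initialized. The `Hit` and `ClusterSummary` types and the `summarize` function are hypothetical and not part of testprog.C. One assumption worth stating: lines 107 and 108 of testprog.C look like they track a minimum (`if (x < ct) ct = x;`), so the sketch starts the minimum at the largest float rather than at the `= 0.0` the compiler suggests, since a start value of 0 would hide any positive time.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical stand-in for one pixel hit; testprog.C uses separate arrays.
struct Hit {
    float x;
    float y;
    float time;
};

// Every accumulator gets a defined starting value.
struct ClusterSummary {
    int nPix = 0;        // counter starts at zero
    float sumX = 0.0f;   // sums start at zero
    float sumY = 0.0f;
    // A running minimum starts at the largest representable value,
    // so the first hit always replaces it.
    float earliestTime = std::numeric_limits<float>::max();
};

ClusterSummary summarize(const std::vector<Hit>& hits) {
    ClusterSummary s;
    for (const Hit& h : hits) {
        ++s.nPix;
        s.sumX += h.x;
        s.sumY += h.y;
        s.earliestTime = std::min(s.earliestTime, h.time);
    }
    // Sanity check: one count per hit.
    assert(s.nPix == static_cast<int>(hits.size()));
    return s;
}
```

With no hits at all, `earliestTime` keeps its sentinel value, so a caller would need to check `nPix` before using it.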

Hi,

Thank you so much for your time and help!

David

The problem is indeed a mismatch between the TTree and the code:

   Float_t         MAPSA_Pixelnumber[10];
....
   fChain->SetBranchAddress("MAPSA_Pixelnumber", &MAPSA_Pixelnumber, &b_MAPSA_Pixelnumber);

but the branch in the file carries no array dimension:

root [3] Tree->GetBranch("MAPSA_Pixelnumber")->Print()
*Br    0 :MAPSA_Pixelnumber : MAPSA_Pixelnumber/f                            *
*Entries :   102558 : Total  Size=     411941 bytes  File Size  =     148214 *
*Baskets :       13 : Basket Size=      32000 bytes  Compression=   2.78     *
*............................................................................*

So in the loop the code:

      //accumulate data in the event
      for (int i = 0; i < 10; ++i) {
....
            int ip = MAPSA_Pixelnumber[i] + 0.000001;

gets a random value for ip (for example -2147483648).
Consequently, the assignments:

            pixmap_mpa[im][ip] = MAPSA_MPAnumber;

write to random places in memory, leading to random behavior.

Note that after perturbing the code I saw it crash on Linux as well, but often the random value of MAPSA_Pixelnumber[i] is close to zero, which leads to ip ‘randomly’ being calculated as zero, and all appears to be fine.

Cheers,
Philippe.
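
As a hedged illustration of the diagnosis above, the following standalone sketch shows one way to guard the index before it is used. It does not fix the real problem, which is to make the declaration in testprog.C match what the branch actually stores. It only turns a silent out-of-bounds write into a detectable failure. The array dimensions and the names `kNumMpa`, `kNumPix`, `toPixelIndex` and `storeMpaNumber` are assumptions made up for this example; the real sizes of the pixmap arrays are not shown in the thread.

```cpp
#include <cassert>

// Hypothetical dimensions: the real sizes of the pixmap arrays in testprog.C
// are not shown in this thread.
constexpr int kNumMpa = 6;
constexpr int kNumPix = 48;

// Turn the float pixel number read from the tree into an array index.
// Returns false (and leaves 'index' untouched) if the value is NaN or would
// fall outside [0, kNumPix).
bool toPixelIndex(float pixelNumber, int& index) {
    // Same small offset as the original code, but computed in double and
    // range-checked before the conversion to int.
    const double shifted = static_cast<double>(pixelNumber) + 0.000001;
    if (!(shifted >= 0.0 && shifted < static_cast<double>(kNumPix))) {
        return false;
    }
    index = static_cast<int>(shifted);
    return true;
}

// Store the MPA number only if both indices are valid; report failure otherwise.
bool storeMpaNumber(int (&pixmap)[kNumMpa][kNumPix], int im,
                    float pixelNumber, int mpaNumber) {
    int ip = 0;
    if (im < 0 || im >= kNumMpa || !toPixelIndex(pixelNumber, ip)) {
        return false;  // skip (or log) the entry instead of writing out of bounds
    }
    assert(ip >= 0 && ip < kNumPix);  // guaranteed by toPixelIndex
    pixmap[im][ip] = mpaNumber;
    return true;
}
```

A check like this would have rejected values such as -2147483648 on both platforms, instead of letting the outcome depend on whatever happened to be in the uninitialized array elements.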
