pyROOT runtime error on M2 Max chip

Hi all,

I’ve run into a strange problem with pyROOT.

I’m writing an MVA training script using RDataFrame and TMVA in pyROOT. The code snippet is as follows:

import ROOT

# Preprocessing input samples
rdf = ROOT.RDataFrame("tree", "file.root")

rdf = rdf.Define("z", "x + y").Filter("z > 0")
rdf.Snapshot("tree", "out.root")

# TMVA training
ROOT.TMVA.Tools.Instance()
loader = ROOT.TMVA.DataLoader("test")  # works up to this point

for cur_var in ["x", "y", "z"]:
    loader.AddVariable(cur_var)  # *** Break *** segmentation violation

# If the AddVariable lines are commented out, the segmentation violation occurs here instead
f = ROOT.TFile("out.root")

I tried setting breakpoints in the AddVariable loop, and the debugger reported the same segmentation violation. Interestingly, inside the debugger I can still access loader:

loader.GetName() # test
loader.AddCut("test") # no error
loader.Dump() 
"""
==> Dumping object at: 0x000000017425b090, name=fold_0, class=TMVA::DataLoader
*fDataSetManager              ->60000244fd50      DSMTEST
*fDataInputHandler            ->6000076282d0      ->
fDefaultTrfs                  ->17425b1a8         list of transformations on default DataSet
fOptions                                          option string given by construction (presently only "V")
fOptions.fRep                 ->17425b1c8         ! String data
fTransformations              I                   List of transformations to test
fTransformations.fRep         ->17425b1e0         ! String data
fVerbose                      false               verbose mode
fDataAssignType               2                   flags for data assigning
fTrainAssignTree              ->17425b1f8         for each class: tmp tree if user wants to assign the events directly
fTestAssignTree               ->17425b210         for each class: tmp tree if user wants to assign the events directly
"""

If I skip the preprocessing step (the RDataFrame part), the TMVA part works smoothly.

Exactly the same script works fine on another macOS system and on lxplus.


Segmentation violation message

The segmentation violation error message is as follows:

*** Break *** segmentation violation
[/Applications/ROOT_6.26/install/lib/libCore.so] TUnixSystem::DispatchSignals(ESignals) (no debug info)
[/usr/lib/system/libsystem_platform.dylib] _sigtramp (no debug info)
[/Applications/ROOT_6.26/install/lib/libcppyy_backend3_9.so] WrapperCall(long, unsigned long, void*, void*, void*) (no debug info)
[/Applications/ROOT_6.26/install/lib/libcppyy3_9.so] CPyCppyy::(anonymous namespace)::VoidExecutor::Execute(long, void*, CPyCppyy::CallContext*) (no debug info)
[/Applications/ROOT_6.26/install/lib/libcppyy3_9.so] CPyCppyy::CPPMethod::ExecuteFast(void*, long, CPyCppyy::CallContext*) (no debug info)
[/Applications/ROOT_6.26/install/lib/libcppyy3_9.so] CPyCppyy::CPPMethod::ExecuteProtected(void*, long, CPyCppyy::CallContext*) (no debug info)
[/Applications/ROOT_6.26/install/lib/libcppyy3_9.so] CPyCppyy::CPPMethod::Call(CPyCppyy::CPPInstance*&, _object*, _object*, CPyCppyy::CallContext*) (no debug info)
[/Applications/ROOT_6.26/install/lib/libcppyy3_9.so] CPyCppyy::(anonymous namespace)::mp_call(CPyCppyy::CPPOverload*, _object*, _object*) (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] _PyObject_MakeTpCall (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] call_function (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] _PyEval_EvalFrameDefault (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] function_code_fastcall (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] call_function (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] _PyEval_EvalFrameDefault (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] _PyEval_EvalCode (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] _PyFunction_Vectorcall (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] call_function (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] _PyEval_EvalFrameDefault (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] _PyEval_EvalCode (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] pyrun_file (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] pyrun_simple_file (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] PyRun_SimpleFileExFlags (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] pymain_run_file (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] pymain_run_python (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] Py_RunMain (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] pymain_main (no debug info)
[/Users/avencast/miniconda3/bin/python3.9] main (no debug info)
[/usr/lib/dyld] start (no debug info)
Traceback (most recent call last):
  File "PycharmProjects/mvatoolkit/mva_toolkit_bin/process.py", line 84, in <module>
    process_fold(config=cfg, fold=0)
  File "PycharmProjects/mvatoolkit/mva_toolkit_bin/process.py", line 74, in process_fold
    train.train(config['hyper_parameters'])
  File "PycharmProjects/mvatoolkit/TMVAToolkit/TrainProcessor.py", line 38, in train
    loader.AddVariable(cur_var)
TypeError: none of the 2 overloaded methods succeeded. Full details:
  void TMVA::DataLoader::AddVariable(const TString& expression, const TString& title, const TString& unit, char type = 'F', double min = 0, double max = 0) =>
    TypeError: takes at least 3 arguments (1 given)
  void TMVA::DataLoader::AddVariable(const TString& expression, char type = 'F', double min = 0, double max = 0) =>
    SegmentationViolation: segfault in C++; program state was reset



Current Setup (not working)

ROOT Version: 6.26/10
Platform: macOS Ventura with M2 Max
Compiler: Apple clang version 14.0.0 (clang-1400.0.29.202)
Python Version: Python 3.9.11

Previous Setup (works fine)

ROOT Version: 6.26/10
Platform: macOS Big Sur with Intel
Compiler: Apple clang version 12.0.5 (clang-1205.0.22.11)
Python Version: Python 3.9.12
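
When comparing an Apple silicon setup with an Intel one, it can help to report which architecture the Python interpreter itself is running as, since a conda Python built for Intel can run on an M-series Mac through Rosetta 2. The thread does not establish that this is the cause here; the following is only a minimal sketch for collecting that information with the standard library. The function name describe_environment is hypothetical.

```python
import platform
import sys


def describe_environment():
    """Collect interpreter and platform details worth quoting in a bug report."""
    return {
        "python_version": platform.python_version(),
        "python_executable": sys.executable,
        # Expected to be 'arm64' for a native Apple silicon interpreter and
        # 'x86_64' for an Intel build (including one running under Rosetta 2).
        "machine": platform.machine(),
        "system": platform.system(),
        "release": platform.release(),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this with the same interpreter that imports ROOT shows whether Python and the ROOT build target the same architecture.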


Hi @Avencast ,

It's hard to tell what’s going wrong; it looks like the Python stack trace is just propagating a SIGSEGV signal from the underlying C++ code. We would need to reproduce this with a build of ROOT that has C++ debug symbols in order to see more clearly.

If I understand correctly, you don’t need TMVA in the program to observe the crash, so I guess we can remove it from the equation to simplify the reproducer.

If you can provide a recipe to reproduce the crash on lxplus.cern.ch or in a Docker container (or, ideally, on my machine! :smiley: ) I would be happy to take a look.

Cheers,
Enrico
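
One way to narrow the script down to a small reproducer is to run each candidate stage in a fresh interpreter and see which one fails. The sketch below is an illustration under assumptions, not something posted in the thread: the helper names run_stage and describe_status and the STAGES dictionary are hypothetical, and the ROOT snippets simply reuse the calls and file names from the original post. A crash that PyROOT turns into a Python exception shows up as a non-zero exit status, while a process killed outright by a signal shows up as a negative return code on POSIX systems.

```python
import subprocess
import sys


def run_stage(code):
    """Run a Python snippet in a fresh interpreter and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )
    return result.returncode


def describe_status(returncode):
    """Turn a subprocess return code into a short human-readable label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative value means the child was killed by that signal.
        return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


# Candidate stages, reusing the calls from the original post (TMVA left out).
RDF_PART = (
    "import ROOT\n"
    "rdf = ROOT.RDataFrame('tree', 'file.root')\n"
    "rdf.Define('z', 'x + y').Filter('z > 0').Snapshot('tree', 'out.root')\n"
)
STAGES = {
    "rdataframe_only": RDF_PART,
    "rdataframe_then_tfile": RDF_PART + "f = ROOT.TFile('out.root')\n",
}

if __name__ == "__main__":
    for name, code in STAGES.items():
        print(name, "->", describe_status(run_stage(code)))
```

Because every stage starts from a clean process, a failure in one stage cannot corrupt the state seen by the next, which makes it easier to tell which combination of calls is needed to trigger the crash.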

Hi @eguiraud ,

Thanks for your reply. I know it’s hard to understand the behavior here. The most difficult part is that I cannot reproduce the error on other platforms using the exact same code. I will try building ROOT with debug symbols and see if I can find anything there.

Just to confirm: to get debug information, I only need to pass -DCMAKE_BUILD_TYPE=Debug to cmake?

Best,
Yulei

Yes, when compiling ROOT from source, that cmake configuration option will guarantee that ROOT is built with debug symbols.
