Stack Overflow with TMVA/Cuts training

Hi,
I’m getting my feet wet with TMVA. The following code gives me a stack overflow during training. Below is the code and the log messages that come from running it. When I run in the debugger there are so many stack frames (all in libTMVA) that the debugger basically gives up.

This error happens when using the “Cuts” training (see below). If I change to, say, likelihood, then the training completes.

Am I doing something wrong here? Is it some feature of my data that might be causing this? Many thanks!

Cheers,
Gordon.

#include <iostream>

#include "TMVA/Factory.h"
#include "TMVA/Types.h"
#include "TFile.h"
#include "TTree.h"

using namespace std;

int main()
{
	cout << "Creating the factory" << endl;

	// Create the factory
	auto outputFile = TFile::Open("tmvaOutput.root", "RECREATE");
	auto factory = new TMVA::Factory("testjob", outputFile, "!V:!Silent:Color:DrawProgressBar:Transformations=I;D;P;G,D:AnalysisType=Classification");

	// Signal and background file.
	auto signalFile = TFile::Open("signal.training.root", "READ");
	auto backgroundFile = TFile::Open("background.training.root", "READ");
	auto signalTree = static_cast<TTree*>(signalFile->Get("mytree"));
	auto backgroundTree = static_cast<TTree*>(backgroundFile->Get("mytree"));

	factory->AddSignalTree(signalTree, 1.0);
	factory->AddBackgroundTree(backgroundTree, 1.0);

	// The variables
	factory->AddVariable("nTracks", "Number of Tracks", "", 'I');
	factory->AddVariable("logR", "CalRatio", "", 'F');

	// Book the straight cuts guy
	factory->BookMethod(TMVA::Types::kCuts, "Cuts",
		"!H:!V:FitMethod=MC:EffSel:SampleSize=200000:VarProp=FSmart:VarTransform=Decorrelate");
	//factory->BookMethod(
	//	TMVA::Types::kLikelihood, 
	//	"Likelihood",
	//	"H:!V:TransformOutput:PDFInterpol=Spline2:NSmoothSig[0]=20:NSmoothBkg[0]=20:NSmoothBkg[1]=10:NSmooth=1:NAvEvtPerBin=50"
	//	);

	// And do the training
	factory->TrainAllMethods();

	// And some dump output
	factory->TestAllMethods();
	factory->EvaluateAllMethods();

	// Clean up
	outputFile->Close();
	delete outputFile;
	delete factory;

	return 0;
}

Here are the messages:

PS C:\Users\gordo\Documents\Code\ROot\TMVATests\TMVASimpleStraightCutTest> ..\Release\TMVASimpleStraightCutTest.exe
Creating the factory
--- Factory                  : You are running ROOT Version: 5.34/34, Oct 2, 2015
--- Factory                  :
--- Factory                  : _/_/_/_/_/ _|      _|  _|      _|    _|_|
--- Factory                  :    _/      _|_|  _|_|  _|      _|  _|    _|
--- Factory                  :   _/       _|  _|  _|  _|      _|  _|_|_|_|
--- Factory                  :  _/        _|      _|    _|  _|    _|    _|
--- Factory                  : _/         _|      _|      _|      _|    _|
--- Factory                  :
--- Factory                  : ___________TMVA Version 4.2.0, Sep 19, 2013
--- Factory                  :
--- DataSetInfo              : Added class "Signal"      with internal class number 0
--- Factory                  : Add Tree mytree of type Signal with 5632 events
--- DataSetInfo              : Added class "Background"  with internal class number 1
--- Factory                  : Add Tree mytree of type Background with 3270550 events
--- Factory                  : Booking method: Cuts
--- Cuts                     : Create Transformation "Decorrelate" with events from all classes.
--- Deco                     : Transformation, Variable selection :
--- Deco                     : Input : variable 'nTracks' (index=0).   <---> Output : variable 'nTracks' (index=0).
--- Deco                     : Input : variable 'logR' (index=1).   <---> Output : variable 'logR' (index=1).
--- Cuts                     : Use optimization method: "Monte Carlo"
--- Cuts                     : Use efficiency computation method: "Event Selection"
--- Cuts                     : Use "FSmart" cuts for variable: 'nTracks'
--- Cuts                     : Use "FSmart" cuts for variable: 'logR'
--- DataSetFactory           : Splitmode is: "RANDOM" the mixmode is: "SAMEASSPLITMODE"
--- DataSetFactory           : Create training and testing trees -- looping over class "Signal" ...
--- DataSetFactory           : Weight expression for class 'Signal': ""
--- DataSetFactory           : Create training and testing trees -- looping over class "Background" ...
--- DataSetFactory           : Weight expression for class 'Background': ""
--- DataSetFactory           : Number of events in input trees (after possible flattening of arrays):
--- DataSetFactory           :     Signal          -- number of events       : 5632   / sum of weights: 5632
--- DataSetFactory           :     Background      -- number of events       : 3270550  / sum of weights: 3.27055e+006
--- DataSetFactory           :     Signal     tree -- total number of entries: 5632
--- DataSetFactory           :     Background tree -- total number of entries: 3270550
--- DataSetFactory           : Preselection: (will NOT affect number of requested training and testing events)
--- DataSetFactory           :     No preselection cuts applied on event classes
--- DataSetFactory           : Weight renormalisation mode: "EqualNumEvents": renormalises all event classes ...
--- DataSetFactory           :  such that the effective (weighted) number of events in each class is the same
--- DataSetFactory           :  (and equals the number of events (entries) given for class=0 )
--- DataSetFactory           : ... i.e. such that Sum[i=1..N_j]{w_i} = N_classA, j=classA, classB, ...
--- DataSetFactory           : ... (note that N_j is the sum of TRAINING events
--- DataSetFactory           :  ..... Testing events are not renormalised nor included in the renormalisation factor!)
--- DataSetFactory           : --> Rescale Signal     event weights by factor: 1
--- DataSetFactory           : --> Rescale Background event weights by factor: 0.00172203
--- DataSetFactory           : Number of training and testing events after rescaling:
--- DataSetFactory           : ------------------------------------------------------
--- DataSetFactory           : Signal     -- training events            : 2816 (sum of weights: 2816) - requested were 0 events
--- DataSetFactory           : Signal     -- testing events             : 2816 (sum of weights: 2816) - requested were 0 events
--- DataSetFactory           : Signal     -- training and testing events: 5632 (sum of weights: 5632)
--- DataSetFactory           : Background -- training events            : 1635275 (sum of weights: 2816) - requested were 0 events
--- DataSetFactory           : Background -- testing events             : 1635275 (sum of weights: 1.63528e+006) - requested were 0 events
--- DataSetFactory           : Background -- training and testing events: 3270550 (sum of weights: 1.63809e+006)
--- DataSetFactory           : Create internal training tree
--- DataSetFactory           : Create internal testing tree
--- DataSetInfo              : Correlation matrix (Signal):
--- DataSetInfo              : ------------------------
--- DataSetInfo              :          nTracks    logR
--- DataSetInfo              : nTracks:  +1.000  -0.057
--- DataSetInfo              :    logR:  -0.057  +1.000
--- DataSetInfo              : ------------------------
--- DataSetInfo              : Correlation matrix (Background):
--- DataSetInfo              : ------------------------
--- DataSetInfo              :          nTracks    logR
--- DataSetInfo              : nTracks:  +1.000  +0.094
--- DataSetInfo              :    logR:  +0.094  +1.000
--- DataSetInfo              : ------------------------
--- DataSetFactory           :
--- Factory                  :
--- Factory                  : current transformation string: 'I'
--- Factory                  : Create Transformation "I" with events from all classes.
--- Id                       : Transformation, Variable selection :
--- Id                       : Input : variable 'nTracks' (index=0).   <---> Output : variable 'nTracks' (index=0).
--- Id                       : Input : variable 'logR' (index=1).   <---> Output : variable 'logR' (index=1).
--- Factory                  :
--- Factory                  : current transformation string: 'D'
--- Factory                  : Create Transformation "D" with events from all classes.
--- Deco                     : Transformation, Variable selection :
--- Deco                     : Input : variable 'nTracks' (index=0).   <---> Output : variable 'nTracks' (index=0).
--- Deco                     : Input : variable 'logR' (index=1).   <---> Output : variable 'logR' (index=1).
--- Factory                  :
--- Factory                  : current transformation string: 'P'
--- Factory                  : Create Transformation "P" with events from all classes.
--- PCA                      : Transformation, Variable selection :
--- PCA                      : Input : variable 'nTracks' (index=0).   <---> Output : variable 'nTracks' (index=0).
--- PCA                      : Input : variable 'logR' (index=1).   <---> Output : variable 'logR' (index=1).
--- Factory                  :
--- Factory                  : current transformation string: 'G,D'
--- Factory                  : Create Transformation "G" with events from all classes.
--- Gauss                    : Transformation, Variable selection :
--- Gauss                    : Input : variable 'nTracks' (index=0).   <---> Output : variable 'nTracks' (index=0).
--- Gauss                    : Input : variable 'logR' (index=1).   <---> Output : variable 'logR' (index=1).
--- Factory                  : Create Transformation "D" with events from all classes.
--- Deco                     : Transformation, Variable selection :
--- Deco                     : Input : variable 'nTracks' (index=0).   <---> Output : variable 'nTracks' (index=0).
--- Deco                     : Input : variable 'logR' (index=1).   <---> Output : variable 'logR' (index=1).
--- Id                       : Preparing the Identity transformation...
--- TFHandler_Factory        : -----------------------------------------------------------
--- TFHandler_Factory        : Variable        Mean        RMS   [        Min        Max ]
--- TFHandler_Factory        : -----------------------------------------------------------
--- TFHandler_Factory        :  nTracks:     1.4880     2.5710   [    0.00000     28.000 ]
--- TFHandler_Factory        :     logR:    -15.690     185.45   [    -999.00     999.00 ]
--- TFHandler_Factory        : -----------------------------------------------------------
--- TFHandler_Factory        : Plot event variables for Id
--- TFHandler_Factory        : Create scatter and profile plots in target-file directory:
--- TFHandler_Factory        : tmvaOutput.root:/InputVariables_Id/CorrelationPlots
--- Deco                     : Preparing the Decorrelation transformation...
--- TFHandler_Factory        : -----------------------------------------------------------
--- TFHandler_Factory        : Variable        Mean        RMS   [        Min        Max ]
--- TFHandler_Factory        : -----------------------------------------------------------
--- TFHandler_Factory        :  nTracks:    0.57849     1.0000   [  -0.018653     10.891 ]
--- TFHandler_Factory        :     logR:  -0.084576     1.0000   [    -5.3869     5.3869 ]
--- TFHandler_Factory        : -----------------------------------------------------------
--- TFHandler_Factory        : Plot event variables for Deco
--- TFHandler_Factory        : Create scatter and profile plots in target-file directory:
--- TFHandler_Factory        : tmvaOutput.root:/InputVariables_Deco/CorrelationPlots
--- PCA                      : Preparing the Principle Component (PCA) transformation...
--- TFHandler_Factory        : -----------------------------------------------------------
--- TFHandler_Factory        : Variable        Mean        RMS   [        Min        Max ]
--- TFHandler_Factory        : -----------------------------------------------------------
--- TFHandler_Factory        :  nTracks:     33.296     185.45   [    -950.02     1048.0 ]
--- TFHandler_Factory        :     logR:    0.69742     2.5818   [    -25.796     3.4304 ]
--- TFHandler_Factory        : -----------------------------------------------------------
--- TFHandler_Factory        : Plot event variables for PCA
--- TFHandler_Factory        : Create scatter and profile plots in target-file directory:
--- TFHandler_Factory        : tmvaOutput.root:/InputVariables_PCA/CorrelationPlots
--- Gauss                    : Preparing the Gaussian transformation...
--- Deco                     : Preparing the Decorrelation transformation...
--- TFHandler_Factory        : -----------------------------------------------------------
--- TFHandler_Factory        : Variable        Mean        RMS   [        Min        Max ]
--- TFHandler_Factory        : -----------------------------------------------------------
--- TFHandler_Factory        :  nTracks:   -0.28327     1.0000   [    -1.4102     7.0447 ]
--- TFHandler_Factory        :     logR:  -0.050603     1.0000   [    -2.7973     4.7633 ]
--- TFHandler_Factory        : -----------------------------------------------------------
--- TFHandler_Factory        : Plot event variables for Gauss_Deco
--- TFHandler_Factory        : Create scatter and profile plots in target-file directory:
--- TFHandler_Factory        : tmvaOutput.root:/InputVariables_Gauss_Deco/CorrelationPlots
--- TFHandler_Factory        :
--- TFHandler_Factory        : Ranking input variables (method unspecific)...
--- IdTransformation         : Ranking result (top variable is best ranked)
--- IdTransformation         : -----------------------------------------
--- IdTransformation         : Rank : Variable         : Separation
--- IdTransformation         : -----------------------------------------
--- IdTransformation         :    1 : Number of Tracks : 1.174e-001
--- IdTransformation         :    2 : CalRatio         : 3.174e-002
--- IdTransformation         : -----------------------------------------
--- Factory                  :
--- Factory                  : Train all methods for Classification ...
--- Factory                  : Train method: Cuts for Classification
--- Deco                     : Preparing the Decorrelation transformation...
--- TFHandler_Cuts           : -----------------------------------------------------------
--- TFHandler_Cuts           : Variable        Mean        RMS   [        Min        Max ]
--- TFHandler_Cuts           : -----------------------------------------------------------
--- TFHandler_Cuts           :  nTracks:    0.57849     1.0000   [  -0.018653     10.891 ]
--- TFHandler_Cuts           :     logR:  -0.084576     1.0000   [    -5.3869     5.3869 ]
--- TFHandler_Cuts           : -----------------------------------------------------------
--- Cuts                     : Begin training
--- TFHandler_Cuts           : -----------------------------------------------------------
--- TFHandler_Cuts           : Variable        Mean        RMS   [        Min        Max ]
--- TFHandler_Cuts           : -----------------------------------------------------------
--- TFHandler_Cuts           :  nTracks:    0.57849     1.0000   [  -0.018653     10.891 ]
--- TFHandler_Cuts           :     logR:  -0.084576     1.0000   [    -5.3869     5.3869 ]
--- TFHandler_Cuts           : -----------------------------------------------------------
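
As a sanity check on the log above, the background rescale factor printed by DataSetFactory can be reproduced by hand. This is a minimal sketch, assuming the "EqualNumEvents" mode simply scales the background so that its summed training weight equals the number of signal training events, as the log text describes. The function name is made up for illustration and is not part of TMVA.

```cpp
// Hypothetical helper (not a TMVA function): reproduces the background
// rescale factor that the log reports for the "EqualNumEvents" mode.
// Precondition: nBackgroundTraining must be greater than zero.
double equalNumEventsScale(long nSignalTraining, long nBackgroundTraining)
{
    // Signal keeps weight 1; each background event is scaled so that the
    // background weights sum to the number of signal training events.
    return static_cast<double>(nSignalTraining) /
           static_cast<double>(nBackgroundTraining);
}
```

With the numbers from the log, equalNumEventsScale(2816, 1635275) comes out at roughly 0.001722, consistent with the printed factor 0.00172203 and with the background training sum of weights of 2816.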

Hi Gordon,

It seems that you are running an old(er) version of ROOT (and TMVA) on Windows. It might make sense to move up to the most recent release and see if you still have this problem. To help us debug what is going on, it would be great if you could also attach a sample of the data files you are using, so that we can attempt to reproduce the problem on Linux.
Thanks,

Cheers,

Sergei

I am using the TMVA that comes with the latest available version of ROOT. Is there a way to use a more recent version of TMVA with the current version of ROOT?

I will post data files in a short while. My data definitely looks “funny”: it has spikes at 1000 and -1000 (these values are flags), and everything else is concentrated between -1 and +5. I will play around a little to see if I can characterize the explosion before posting.
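
To make the "flag value" issue concrete, here is a minimal sketch of dropping the flagged entries before they reach the training. It is plain C++ on a vector of values, not TMVA code. The threshold of 900 is an assumption chosen to cover both the ±1000 mentioned above and the ±999 range that the log reports for logR; both function names are hypothetical.

```cpp
#include <cmath>
#include <vector>

// Assumed threshold: any |logR| at or above this is treated as a flag
// rather than a physical value. Pick whatever matches your own ntuple.
const double kFlagThreshold = 900.0;

// Returns true for entries that look like flags (e.g. -999, 999, 1000).
bool isFlagValue(double logR)
{
    return std::fabs(logR) >= kFlagThreshold;
}

// Returns a copy of the input with all flagged entries removed.
std::vector<double> dropFlagValues(const std::vector<double>& logR)
{
    std::vector<double> kept;
    kept.reserve(logR.size());
    for (double value : logR) {
        if (!isFlagValue(value)) {
            kept.push_back(value);
        }
    }
    return kept;
}
```

Within TMVA itself, a selection like this could presumably be expressed as a TCut passed to Factory::PrepareTrainingAndTestTree instead of filtering by hand; whether that avoids the crash is not something this thread establishes.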

[quote=“sergei”]
It seems that you are running an old(er) version of ROOT (and TMVA) on Windows.[/quote]

I just ran in the ATLAS environment on Linux (CernVM), with ROOT 6.

--- Factory                  : You are running ROOT Version: 6.02/12, Jun 24, 2015
--- Factory                  :
--- Factory                  : _/_/_/_/_/ _|      _|  _|      _|    _|_|
--- Factory                  :    _/      _|_|  _|_|  _|      _|  _|    _|
--- Factory                  :   _/       _|  _|  _|  _|      _|  _|_|_|_|
--- Factory                  :  _/        _|      _|    _|  _|    _|    _|
--- Factory                  : _/         _|      _|      _|      _|    _|
--- Factory                  :
--- Factory                  : ___________TMVA Version 4.2.0, Sep 19, 2013

It looks like the exact same version of TMVA. Is this expected?

Ok - your main website (??), tmva.sourceforge.net/, seems to claim that 4.2.0 is the most recent officially released version. That was 2013 - have there not been many updates? Or has it just been too hard to release? Or perhaps you aren’t at sourceforge anymore?

More testing. When I run under a modern version of ROOT, with seemingly the same version of TMVA, there is no error - though the default GA does use almost 30% of the memory of my machine.

On Windows I found that if the input data is smaller, I do not hit this crash. So it seems that the stack is being abused in the first phase of the training, and Windows is more sensitive to it and cries uncle first. I could increase the stack limit, but since the problem also goes away when I “clean” the data, I’ve not been motivated to follow up.
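
One way to picture how data with large spikes could stress the stack: if some part of the training fills an unbalanced binary search tree recursively, many identical values form a single long chain, and the recursion depth grows with the number of duplicates. Nothing in this thread confirms that this is what libTMVA does; the sketch below only illustrates the general effect, using a made-up NaiveTree class. It inserts iteratively and stores nodes in a flat pool so that the demo itself cannot overflow the stack.

```cpp
#include <cstddef>
#include <vector>

// Marker for "no child" in the index-based node pool.
const std::size_t kNoChild = static_cast<std::size_t>(-1);

// Illustrative unbalanced binary search tree (not TMVA code).
class NaiveTree {
public:
    // Inserts a key without recursion and tracks the deepest level reached.
    void insert(double key)
    {
        nodes_.push_back(Node{key, kNoChild, kNoChild});
        const std::size_t added = nodes_.size() - 1;
        if (added == 0) {
            maxDepth_ = 1;  // first key becomes the root
            return;
        }
        std::size_t current = 0;
        std::size_t depth = 2;
        for (;;) {
            // Ties go to the right, so repeated keys form one long chain.
            std::size_t& child = (key < nodes_[current].key)
                                     ? nodes_[current].left
                                     : nodes_[current].right;
            if (child == kNoChild) {
                child = added;
                break;
            }
            current = child;
            ++depth;
        }
        if (depth > maxDepth_) {
            maxDepth_ = depth;
        }
    }

    // Depth of the deepest node; a recursive walk would need this many frames.
    std::size_t maxDepth() const { return maxDepth_; }

private:
    struct Node {
        double key;
        std::size_t left;
        std::size_t right;
    };
    std::vector<Node> nodes_;
    std::size_t maxDepth_ = 0;
};

// Convenience helper: depth of the tree after inserting all values in order.
std::size_t depthAfterInserting(const std::vector<double>& values)
{
    NaiveTree tree;
    for (double value : values) {
        tree.insert(value);
    }
    return tree.maxDepth();
}
```

In this toy model, a thousand copies of the same flag value give a depth of a thousand, while well-mixed distinct values stay shallow. That would be consistent with smaller or "cleaned" inputs not crashing, but it remains a guess about the cause.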

Hi Gordon,

Thank you for the feedback. In terms of TMVA versions and updates: there has been a lot of recent development (beginning in September 2015), which is available in the latest (master) version of ROOT. For an idea of what is new and what features are upcoming:

indico.cern.ch/event/483895/con … 03-iml.pdf

You are correct to point out that the sourceforge website needs an update. We are planning to make a major release in the upcoming months, while releasing new features for user feedback in the meantime. We will update and publicize the new TMVA version as soon as it is released. Thanks,

Best regards,

Sergei

It’s good that you have solved the problem. We have not yet touched GA, and in the upcoming work on the memory side we will have a look at this as well.

Thanks!

That link you posted was just three slides. Were there meant to be more?

I would also say that a move to a more modern source control repo might be nice. :slight_smile:

Sorry, I meant to link this talk:

indico.cern.ch/event/483895/con … VA_IML.pdf

Thanks,
Best,

Sergei

Perfect, thanks!