TMVA training with decreasing number of events

mwojtas · January 18, 2017, 10:23am

Dear experts,

I am trying to train several neural networks at once, but after the first one (MLP_10) is trained on around 39k events, all the subsequent ones (MLP_11…MLP_20) are trained on only 3.9k events. I attach a screenshot of the training log. In the code, I add training and testing events one by one after the selection process:

if (isigA == 0)
{
    // Signal event: goes to the training or the test sample
    if (aTrainOption)
    {
        aTmvaFactory->AddSignalTrainingEvent(m_TmvaTrainDVars, GetWeight());
    }
    else
    {
        aTmvaFactory->AddSignalTestEvent(m_TmvaTrainDVars, GetWeight());
    }
}
else
{
    // Background event: goes to the training or the test sample
    if (aTrainOption)
    {
        aTmvaFactory->AddBackgroundTrainingEvent(m_TmvaTrainDVars, aLBG->NominalBackgrEventWeight());
    }
    else
    {
        aTmvaFactory->AddBackgroundTestEvent(m_TmvaTrainDVars, aLBG->NominalBackgrEventWeight());
    }
}

(The weights are equal to one in this case.)
I then prepare the trees using:

aTmvaFactory->PrepareTrainingAndTestTree( cuts,"nTrain_Signal=0:nTrain_Background=0:nTest_Signal=0:nTest_Background=0:SplitMode=Random:NormMode=NumEvents:V" );

The MLPs are booked in a for loop:

for (int i = 0; i < 10; ++i)
{
    number += i;  // append the loop index to the method name suffix
    aTmvaFactory->BookMethod(TMVA::Types::kMLP,"MLP_1"+number,"!H:!V:ConvergenceImprove=1e-4:ConvergenceTests=20:TestRate=5:Sampling=0.1:SamplingEpoch=100:SamplingImportance=2:Tau=3:HiddenLayers=60,40:VarTransform= G,D,G,Norm:NCycles= 300 :NeuronType= sigmoid:TrainingMethod=BFGS :UseRegulator=False:EstimatorType=MSE:RandomSeed=0");
    number = "";  // reset the suffix for the next iteration
}
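For reference, here is a minimal sketch of the method titles this loop produces. It assumes `number` is a ROOT TString (its declaration is not shown above), so that `number += i` appends the index as text. The helper `makeMlpTitles` is hypothetical, not part of TMVA, and uses plain std::string so it can be checked without ROOT:

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of TMVA): builds the titles that a booking
// loop of the form prefix + i, for i in [0, count), would pass to BookMethod.
std::vector<std::string> makeMlpTitles(const std::string& prefix, int count) {
    std::vector<std::string> titles;
    for (int i = 0; i < count; ++i) {
        // std::to_string makes the int-to-text conversion explicit.
        titles.push_back(prefix + std::to_string(i));
    }
    return titles;
}
```

Called as makeMlpTitles("MLP_1", 10), this gives ten titles, MLP_10 through MLP_19, one per booked method.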

Can you help me get the other MLPs trained on the full number of events?

moneta · January 27, 2017, 3:33pm

Hi,

This is strange. Can you please post a minimal macro reproducing this problem?

Thank you

Lorenzo

mwojtas · February 3, 2017, 4:11pm

Hello,

First of all, thank you for your answer. I attach an input file and a modified TMVA classification tutorial that reproduce the problem. If x is the proper number of training events, consecutive MLPs show this pattern when the booking loop is run:

MLP || train events || evaluation events
0 || x || x/10
1 || x/10 || x/10
2 || x/10 || x
3 || x || x

More details are in the code. I am using the most recent ROOT version, 6.06/02.

Cheers,
Maks
InputFile.root (1.57 MB)
TMVAClassification.C (11.1 KB)

moneta · February 6, 2017, 10:49am

Hi,

Thank you for posting your code. I see you are using the Sampling option with a value of 0.1. This means only 10% of the events are used for training. See tmva.sourceforge.net/optionRef.html#MVA::MLP

This probably then affects the MLP when it is called a second time, and I think this is a bug. As a workaround, I suggest removing this option if you can.
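A minimal sketch of that workaround, assuming the option string is built or edited in code rather than by hand. The helper `dropSamplingOptions` is hypothetical, not part of TMVA. It removes every colon-separated option whose key begins with "Sampling", so it also drops SamplingEpoch and SamplingImportance; that goes one step beyond the suggestion above, which only names Sampling, and leaves TMVA to use its defaults for those options:

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of TMVA): drops every colon-separated option
// whose key begins with "Sampling" from a TMVA-style option string.
std::string dropSamplingOptions(const std::string& options) {
    std::istringstream in(options);
    std::string token;
    std::string result;
    while (std::getline(in, token, ':')) {
        // Skip leading spaces so " Sampling=0.1" would also be matched.
        const std::size_t start = token.find_first_not_of(' ');
        const bool isSampling =
            start != std::string::npos && token.compare(start, 8, "Sampling") == 0;
        if (isSampling) continue;
        if (!result.empty()) result += ':';
        result += token;
    }
    return result;
}
```

The returned string could then be passed to BookMethod in place of the original one; all other options, including their spacing, are left untouched.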

Cheers

Lorenzo

mwojtas · February 6, 2017, 3:58pm

Hello,

Thank you for your answer; now everything works fine. I think the Sampling option messed up all the MLPs, including the first one, since their training times went from ~20 min to a couple of hours. The convergence tests were also not working properly: the errors/uncertainties were huge.

Cheers,
Maks