Variables order

Amilkar · July 24, 2017, 6:40pm

I am using the macro $ROOTSYS/tutorials/tmva/TMVAClassification.C
Running: root.exe 'TMVAClassification.C("Cuts")'
The output:
--- Cuts : Cut values for requested signal efficiency: 0.9
--- Cuts : Corresponding background efficiency : 0.585679
--- Cuts : Transformation applied to input variables : None
--- Cuts : ------------------------------------------
--- Cuts : Cut[ 0]: -2.28765 < myvar1 <= 1e+30
--- Cuts : Cut[ 1]: -1e+30 < myvar2 <= 2.87555
--- Cuts : Cut[ 2]: -3.1227 < var3 <= 1e+30
--- Cuts : Cut[ 3]: -0.668634 < var4 <= 1e+30
--- Cuts : ------------------------------------------

-However, if I change the order of the variables (line 191) so that var3 and var4 are added before myvar1 and myvar2, I get a different set of cuts:
--- Cuts : Cut values for requested signal efficiency: 0.9
--- Cuts : Corresponding background efficiency : 0.588794
--- Cuts : Transformation applied to input variables : None
--- Cuts : ------------------------------------------
--- Cuts : Cut[ 0]: -2.09864 < var3 <= 1e+30
--- Cuts : Cut[ 1]: -0.67248 < var4 <= 1e+30
--- Cuts : Cut[ 2]: -7.27392 < myvar1 <= 1e+30
--- Cuts : Cut[ 3]: -1e+30 < myvar2 <= 2.6745
--- Cuts : ------------------------------------------

-I am doing my own analysis, but the tutorial shows the same behavior. In the end I will use the combination with the highest efficiency, but I have 6 variables. How can I know the correct ordering without running the code 6! (= 720) times?

P.S. The “--- IdTransformation : Ranking result (top variable is best ranked)” output is the same in both cases.
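For reference, here is a minimal standalone C++ sketch (no ROOT needed) of why there is no "correct ordering" as far as applying the cuts is concerned. An event passes only if it satisfies every window, so the order in which the windows are listed cannot change the efficiency. The `Cut` and `Event` types, the helper names, and the Gaussian toy events are all hypothetical and made up for illustration; only the cut values are taken from the first output above. If this holds, any difference between the two runs has to come from how the optimizer searches for the cuts, not from the cuts themselves.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical event layout: {myvar1, myvar2, var3, var4}.
using Event = std::array<double, 4>;

// One rectangular window, lower < value <= upper, as in the printed "Cut[i]" lines.
struct Cut {
    std::size_t var;  // index of the variable inside Event
    double lower;
    double upper;
};

// An event passes only if it lies inside every window (a logical AND).
bool passes(const Event& e, const std::vector<Cut>& cuts) {
    for (const Cut& c : cuts) {
        assert(c.var < e.size());
        if (!(e[c.var] > c.lower && e[c.var] <= c.upper)) return false;
    }
    return true;
}

// Fraction of events that pass all cuts.
double efficiency(const std::vector<Event>& events, const std::vector<Cut>& cuts) {
    if (events.empty()) return 0.0;
    std::size_t nPass = 0;
    for (const Event& e : events) {
        if (passes(e, cuts)) ++nPass;
    }
    return static_cast<double>(nPass) / events.size();
}

// Toy Gaussian events (NOT the tutorial data), with myvar1 = var1 + var2
// and myvar2 = var1 - var2 as in the macro.
std::vector<Event> makeToyEvents(std::size_t n, unsigned seed) {
    std::mt19937 rng(seed);
    std::normal_distribution<double> gaus(0.0, 2.0);
    std::vector<Event> events(n);
    for (Event& e : events) {
        const double var1 = gaus(rng);
        const double var2 = gaus(rng);
        const double var3 = gaus(rng);
        const double var4 = gaus(rng);
        e = Event{{var1 + var2, var1 - var2, var3, var4}};
    }
    return events;
}
```

Calling `efficiency` with the four windows listed as myvar1, myvar2, var3, var4 and again as var3, var4, myvar1, myvar2 gives the same number for the same events.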

behrenhoff · July 24, 2017, 9:46pm

Can you paste a few lines of code around the line you are referring to? What exactly did you change? In ROOT master, lines 189-199 are empty or comments and don’t contain any code.

Let me guess: did you see that myvar1 and myvar2 are defined as var1+var2 and var1-var2? Could it be that you changed the definitions of var* but not myvar*?

However, I get completely different numbers with master:

% root ~/src/root/root-head/tutorials/tmva/TMVAClassification.C'("Cuts")' -b -q \
 | grep -A7 'Cut values for requested signal efficiency: 0.9'
[TFile::Cp] Total 0.20 MB       |====================| 100.00 % [1.0 MB/s]
Info in <TFile::OpenFromCache>: using local cache copy of http://root.cern.ch/files/tmva_class_example.root [./files/tmva_class_example.root]
Cuts                     : Cut values for requested signal efficiency: 0.9
                         : Corresponding background efficiency       : 0.501033
                         : Transformation applied to input variables : None
                         : ------------------------------------------
                         : Cut[ 0]:   -7.95481 < myvar1 <=      1e+30
                         : Cut[ 1]:     -1e+30 < myvar2 <=    2.87258
                         : Cut[ 2]:   -3.51779 <   var3 <=      1e+30
                         : Cut[ 3]:  -0.564128 <   var4 <=      1e+30

Amilkar · July 24, 2017, 10:22pm

Hello Wolf:

Lines 191-194 of the code are:

factory->AddVariable( "myvar1 := var1+var2", 'F' );
factory->AddVariable( "myvar2 := var1-var2", "Expression 2", "", 'F' );
factory->AddVariable( "var3", "Variable 3", "units", 'F' );
factory->AddVariable( "var4", "Variable 4", "units", 'F' );

If you move the first two lines below the other two, like this:

factory->AddVariable( "var3", "Variable 3", "units", 'F' );
factory->AddVariable( "var4", "Variable 4", "units", 'F' );
factory->AddVariable( "myvar1 := var1+var2", 'F' );
factory->AddVariable( "myvar2 := var1-var2", "Expression 2", "", 'F' );

The resulting cuts are different.

BTW: I am using ROOT 6.05, but ROOT 5.34 gives a similar result with my analysis code.

Let me know if you need more details.

behrenhoff · July 25, 2017, 12:01am

Happens with current ROOT master as well.
I don’t know the exact implementation details, but the “Cuts” algorithm doesn’t work with correlated input variables. Does the same problem occur when you decorrelate first (i.e. add VarTransform=D)?

Amilkar · July 25, 2017, 1:13pm

I tried “CutsD” and still see the same effect.
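
One possible explanation, which nobody in this thread has confirmed: if the cut optimization draws random trial cuts (I believe the tutorial books "Cuts" with a Monte Carlo sampling fitter, FitMethod=MC, but please check the option string in your copy of the macro), then reordering the variables changes which random draw is paired with which variable. The search then ends at a slightly different point even with the same seed. The standalone C++ sketch below illustrates that effect with a toy random search. It is not the TMVA implementation, and every name in it (`randomSearch`, `makeToy`, `reversed`, the toy Gaussian samples, the sampling range) is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Event = std::vector<double>;

struct Result {
    std::vector<double> lower;  // best lower cut found for each variable
    double sigEff = 1.0;        // signal efficiency of those cuts
    double bkgEff = 1.0;        // background efficiency of those cuts
};

// Fraction of events with every variable above its lower cut.
double passFraction(const std::vector<Event>& evts, const std::vector<double>& lower) {
    assert(!evts.empty());
    std::size_t nPass = 0;
    for (const Event& e : evts) {
        bool ok = true;
        for (std::size_t i = 0; i < lower.size(); ++i) ok = ok && e[i] > lower[i];
        if (ok) ++nPass;
    }
    return static_cast<double>(nPass) / evts.size();
}

// Toy random search: among trials that keep the signal efficiency at or above
// the target, remember the one with the lowest background efficiency.
Result randomSearch(const std::vector<Event>& sig, const std::vector<Event>& bkg,
                    double targetSigEff, std::size_t nTrials, unsigned seed) {
    assert(!sig.empty() && !bkg.empty());
    const std::size_t nVars = sig.front().size();
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> trial(-5.0, 2.0);  // assumed sampling range
    Result best;
    best.lower.assign(nVars, -1e30);  // start from "no cut at all"
    for (std::size_t t = 0; t < nTrials; ++t) {
        std::vector<double> lower(nVars);
        for (double& c : lower) c = trial(rng);  // variable i gets the i-th draw
        const double s = passFraction(sig, lower);
        if (s < targetSigEff) continue;
        const double b = passFraction(bkg, lower);
        if (b < best.bkgEff) {
            best.lower = lower;
            best.sigEff = s;
            best.bkgEff = b;
        }
    }
    return best;
}

// Two-variable Gaussian toy sample; 'shift' separates signal from background.
std::vector<Event> makeToy(std::size_t n, double shift, unsigned seed) {
    std::mt19937 rng(seed);
    std::normal_distribution<double> gaus(0.0, 1.0);
    std::vector<Event> evts;
    evts.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double x = gaus(rng) + shift;
        const double y = gaus(rng) + 0.5 * shift;
        evts.push_back(Event{x, y});
    }
    return evts;
}

// Same events with the variable order reversed.
std::vector<Event> reversed(std::vector<Event> evts) {
    for (Event& e : evts) std::reverse(e.begin(), e.end());
    return evts;
}
```

Running `randomSearch` once on the toy samples and once on their `reversed` copies with the same seed will generally give cut values and background efficiencies that are close but not identical. If that is what happens in TMVA as well, the differences between orderings are sampling noise of the optimizer rather than a sign that one ordering is the right one. Increasing the number of trials, or trying a different fit method, would be one way to test this.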