
Variable importances don't get printed when using CrossValidation

Dear experts,

I have noticed that TMVA does not print the variable importances (neither method-unspecific nor method-specific) when using CrossValidation. Is there a reason for this?

I have tried training both with and without FoldFileOutput=True in the CrossValidation constructor, and I don’t see the variable importances in either.

I am using ROOT 6.22.06 on lxplus, invoked by
source /cvmfs/sft.cern.ch/lcg/app/releases/ROOT/6.22.06/x86_64-centos7-gcc48-opt/bin/thisroot.sh

Thanks,
Arvind.

Hi,

If I remember correctly, this is because the variable importances are calculated on the training set, and with k-fold cross-validation there is no longer a single training set. How to merge the importances from different models trained on different data is not straightforward (to my knowledge), so it was left out.

Just printing the importances for each individual fold and reporting them in the per-fold output should be possible, but I am not able to judge how difficult that would be to implement. Maybe @moneta can provide some insight?
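
To illustrate why merging is not obvious, here is a minimal sketch of one naive option: averaging the per-fold importances and reporting the spread across folds. This is not what TMVA does, and it does not use any TMVA API. The function name `summarize_fold_importances`, the variable names, and all numbers are made up for illustration, and it assumes every fold reports the same variables.

```python
from statistics import mean, stdev

def summarize_fold_importances(per_fold):
    """Combine per-fold importances into (variable, mean, spread) rows.

    per_fold: list of dicts mapping variable name -> importance, one per fold.
    Assumption: all folds contain the same variables.
    """
    variables = per_fold[0].keys()
    rows = []
    for var in variables:
        values = [fold[var] for fold in per_fold]
        # Sample standard deviation needs at least two folds.
        spread = stdev(values) if len(values) > 1 else 0.0
        rows.append((var, mean(values), spread))
    # Rank by mean importance, most important first.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows

# Hypothetical importances for three folds (illustration only).
folds = [
    {"pt": 0.50, "eta": 0.30, "mass": 0.20},
    {"pt": 0.40, "eta": 0.25, "mass": 0.35},
    {"pt": 0.45, "eta": 0.35, "mass": 0.20},
]

summary = summarize_fold_importances(folds)
for var, avg, spread in summary:
    print(f"{var:6s} mean={avg:.3f} spread={spread:.3f}")
```

In these made-up numbers, the second fold ranks `mass` above `eta` while the averaged ranking does not, which is the kind of disagreement between folds that makes a single merged ranking hard to interpret.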

Cheers,
Kim


Hi,

I think it should not be difficult to add the printing of the variable ranking information for each fold when doing cross-validation. If you think this would be useful, could you please open a GitHub issue for it?
Thank you

Lorenzo

Done, https://github.com/root-project/root/issues/7092

Thanks!
Arvind.

Great, thank you!