Dear ROOT experts,
I want to implement the permutation feature importance estimation algorithm for the BDT model created with the TMVA package. I see three ways of doing it, but all have some issues.
The first would be to transform the TMVA BDT model into a sklearn-compatible model and simply use the function I’ve linked to earlier. Unfortunately, I haven’t found any information on how to do such a transformation (only the other way around, sklearn to TMVA). Is it possible?
The second would be to use RBDT’s ability to run inference on an RDataFrame (as seen in this tutorial). Specifically:
- Create an input
- Convert it into
- Shuffle one row using
- Convert it back to
- Calculate the model response.
However, I haven’t found any documentation on how to use the RBDT interface with an old-school TMVA Gradient Boosted Decision Trees model (I’m using ROOT 6.16). Is there a way to use it?
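To make the shuffle-and-evaluate steps concrete, here is a minimal NumPy sketch of the permutation loop I have in mind. It is not tied to TMVA: `predict` is a hypothetical stand-in for whatever ends up evaluating the BDT (RBDT, a Reader loop, or something else), `metric` is any score where higher is better, and the input is assumed to be an array with one event per row and one feature per column, so the shuffle acts on a single feature column.

```python
import numpy as np


def permutation_importance(predict, X, y, metric, n_repeats=5, seed=0):
    """Mean and std of the score drop when each feature column is shuffled.

    predict : callable taking an (n_events, n_features) array and returning
              one prediction per event (hypothetical placeholder for the BDT).
    metric  : callable (y_true, y_pred) -> float, higher is better.
    """
    rng = np.random.default_rng(seed)
    baseline = metric(y, predict(X))
    drops = np.zeros((X.shape[1], n_repeats))
    for j in range(X.shape[1]):
        for r in range(n_repeats):
            X_perm = X.copy()
            # Shuffle only feature j; all other columns stay untouched.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops[j, r] = baseline - metric(y, predict(X_perm))
    return drops.mean(axis=1), drops.std(axis=1)


# Toy check with a fake "model" that only looks at feature 0.
def toy_predict(X):
    return (X[:, 0] > 0).astype(int)


def accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))


toy_rng = np.random.default_rng(1)
X_toy = toy_rng.normal(size=(1000, 3))
y_toy = (X_toy[:, 0] > 0).astype(int)
mean_imp, std_imp = permutation_importance(toy_predict, X_toy, y_toy, accuracy)
print(mean_imp)  # feature 0 should dominate; features 1 and 2 should be zero
```

With this structure everything stays in memory, so the only missing piece is a `predict` that wraps the TMVA model.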
The third is the most straightforward one: do it as in (2), but write the shuffled RDataFrame to a file and calculate the model response the classic way, in a loop over the events in a tree (like in this tutorial). However, this seems rather roundabout and too dependent on the I/O speed of the local storage.
Could you suggest a better way to implement this algorithm?
Thanks in advance,