Handling missing values in TMVA

Dear ROOT experts,

Is there a way to automatically treat missing values for the input variables/features in TMVA algorithms? An example would be a third-jet variable in events that pass an N(jets) >= 2 requirement.

There’s a 2-year-old topic on this forum that says TMVA only supports padding the missing variable values with unphysical values. But that topic still gives no proof that this is a viable idea. Has anything changed since then? Does TMVA now support any newer ways of handling this issue?

Thanks in advance,
Aleksandr

Hi,

As mentioned in the old topic, the only thing TMVA supports is padding the missing values.
It is true that this may not be a very viable approach for algorithms such as decision trees, but it is a feasible option when, for example, using a recurrent neural network.
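A minimal sketch of what padding a missing variable could look like, in plain Python and independent of TMVA. The sentinel value `MISSING_PT` and the helper `third_jet_pt` are hypothetical names chosen for illustration; the sentinel only needs to lie outside the physical range of the variable.

```python
# Hypothetical sentinel: any value outside the physical range of the variable.
MISSING_PT = -999.0

def third_jet_pt(jet_pts, missing=MISSING_PT):
    """Return the pT of the third jet, or the sentinel if there are fewer than three jets."""
    return jet_pts[2] if len(jet_pts) >= 3 else missing

# Toy events, each a list of jet pT values ordered from leading to subleading.
events = [
    [120.5, 80.2, 45.1],  # three jets: the variable exists
    [95.0, 60.3],         # two jets: the third-jet pT is missing
]

padded = [third_jet_pt(jets) for jets in events]
print(padded)  # [45.1, -999.0]
```

The padded column can then be filled into the input tree like any other variable.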
There are examples in HEP of successfully using recurrent networks such as LSTM and GRU with variable-length particle sequences.
Note that TMVA now supports recurrent networks on both CPU and GPU.
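For the recurrent case, a rough sketch of how a variable-length list of particles could be brought to a fixed number of time steps before being written out as network inputs. This is plain Python, not TMVA API; `pad_sequence`, `max_len`, `n_features` and the zero padding value are assumptions made for illustration.

```python
def pad_sequence(particles, max_len, n_features, pad_value=0.0):
    """Truncate or pad a list of per-particle feature lists to exactly max_len steps."""
    # Keep at most max_len particles (assumed to be ordered already, e.g. by pT).
    kept = [list(p) for p in particles[:max_len]]
    # Fill the remaining time steps with padding rows.
    padding = [[pad_value] * n_features for _ in range(max_len - len(kept))]
    return kept + padding

# One event with two jets, each described by (pT, eta).
event = [[120.5, 0.3], [80.2, -1.1]]

fixed = pad_sequence(event, max_len=3, n_features=2)
print(fixed)  # [[120.5, 0.3], [80.2, -1.1], [0.0, 0.0]]
```

Every event then has the same shape, max_len time steps times n_features, whatever its particle multiplicity.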

Best regards

Lorenzo