NaN entries in the dataset

Hi,
I was running the classification macro on my dataset, which has ‘nan’ entries, and I got the error below:

<FATAL>                         : How am I supposed to train a NaN or +-inf?!

So, can’t a ‘nan’ value be passed through the DataLoader in TMVA? I don’t have much experience with TMVA. Can you suggest how I should deal with ‘nan’ values in TMVA?

Regards,
Saumyen


Hi,

You should remove the events containing a NaN with a cut. For example, when calling DataLoader::PrepareTrainingAndTestTree you can pass cuts for signal and background events. There you should exclude events containing NaN, for example with the TMath function TMath::IsNaN(var).

Lorenzo

Thank you so much Lorenzo for the reply and suggestion.
So, these are the lines you suggested modifying:

   TCut preselectionCut = ""; 

   // To also specify the number of testing events, use:
   dataloader->PrepareTrainingAndTestTree( preselectionCut, "nTrain_Signal=6000:nTrain_Background=6000:SplitMode=Random:NormMode=NumEvents:!V" );

Now, how should I write the cut? For each variable I am using, should I put

   TCut preselectionCut = "TMath::IsNaN(PT_l) = False" ;

if PT_l is one of the variables?

Regards,
Saumyen


The cut should be defined as follows (supposing PT_1 and PT_2 are the variables having NaN):

TCut preselectionCut = "!TMath::IsNaN(PT_1) && !TMath::IsNaN(PT_2)";
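
With many input variables, writing the cut by hand gets tedious. Below is a minimal sketch of how the same cut string could be assembled from a list of variable names. MakeNoNaNCut is a hypothetical helper, not part of ROOT or TMVA, and the variable names are the ones assumed above. The helper uses only the standard library; the ROOT calls are shown as comments because they need a ROOT session.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of ROOT/TMVA): builds a selection string that
// rejects every event in which at least one of the listed variables is NaN.
std::string MakeNoNaNCut(const std::vector<std::string>& vars)
{
   std::string cut;
   for (const auto& v : vars) {
      if (!cut.empty()) cut += " && ";      // join the conditions with a logical AND
      cut += "!TMath::IsNaN(" + v + ")";
   }
   return cut;                               // empty string if no variables are given
}

// Possible use inside the TMVA macro (requires ROOT, so shown as comments only):
//    TCut preselectionCut(MakeNoNaNCut({"PT_1", "PT_2"}).c_str());
//    dataloader->PrepareTrainingAndTestTree( preselectionCut,
//       "nTrain_Signal=6000:nTrain_Background=6000:SplitMode=Random:NormMode=NumEvents:!V" );
```

For {"PT_1", "PT_2"} this produces exactly the string given above. Since the error message also mentions +-inf, replacing the condition with TMath::Finite(var) might be worth trying, as it is false for both NaN and infinite values.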

Lorenzo

Thank you so much Lorenzo. It worked perfectly.

Thanks a lot,
Saumyen

For me, even cuts requiring the values to be neither NaN nor inf do not remove all of them. Instead, one can simply apply a cut that keeps the problematic variables within a finite range, for example ] 0 , 1000 [.
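
A minimal sketch of such a range cut, assuming the same variable names as above and the example bounds 0 and 1000 (MakeRangeCut is a hypothetical helper, not part of ROOT or TMVA). A range cut should reject NaN as well, since under IEEE 754 any ordered comparison with NaN is false, and it rejects infinite values because they fall outside any finite interval.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of ROOT/TMVA): builds a selection string that
// keeps only events where every listed variable lies strictly between lo and hi.
std::string MakeRangeCut(const std::vector<std::string>& vars, double lo, double hi)
{
   std::ostringstream cut;
   bool first = true;
   for (const auto& v : vars) {
      if (!first) cut << " && ";            // join the conditions with a logical AND
      cut << "(" << v << " > " << lo << " && " << v << " < " << hi << ")";
      first = false;
   }
   return cut.str();
}

// Possible use inside the TMVA macro (requires ROOT, so shown as comments only):
//    TCut preselectionCut(MakeRangeCut({"PT_1", "PT_2"}, 0, 1000).c_str());
```

The bounds are only an example; they need to be chosen so that no physically meaningful values of the variables are cut away.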