Variables added via the DataLoader AddVariable() method are all zero

I’m attempting to use TMVA, but no matter which variables from the trees I add, it reads 0 for every event. The values are stored as doubles, which from what I’ve gathered can make a difference, but I’m not sure. I suspect it’s something simple like that, but I just can’t track it down. Any help would be appreciated.

import ROOT
import json

def addTrees(dataLoader):
    with open('jsons/plottingInfo.json','r') as f:
        plottingInfo = json.load(f)
        for sampleName, sampleInfo in plottingInfo.items():
            for subsampleName, subsampleInfo in sampleInfo['subsamples'].items():
                #Use partial data for faster testing and debugging
                if subsampleName not in ['dy_HT_70_100','qcd_HT_100_200_benriched']:
                    continue
                subsampleFile = ROOT.TFile.Open(f'rootfiles/{subsampleName}_18.root')
                subsampleTree = subsampleFile.Get('output_tree')

                numSkimmedEvents = subsampleInfo['2018']['Skimmed']
                numUnskimmedEvents = subsampleInfo['2018']['Unskimmed']
                skimmingEfficiency = numSkimmedEvents/numUnskimmedEvents
                crossSection = subsampleInfo["cross_section"]
                subsampleWeight = crossSection*skimmingEfficiency

                if sampleName == "QCD":
                    dataLoader.AddBackgroundTree(subsampleTree, subsampleWeight)
                else:
                    dataLoader.AddSignalTree(subsampleTree, subsampleWeight)

                subsampleTree.SetDirectory(0)
                subsampleFile.Close()


def getNetVariables():
    variables = [
        "numGoodJets",
        "eventLeptonPt"
    ]

    return variables

def runMva():
    outputFile = ROOT.TFile.Open('outputs/mva.root','RECREATE')

    factory = ROOT.TMVA.Factory('mva',outputFile,'')
    dataLoader = ROOT.TMVA.DataLoader('dataset')

    netVariables = getNetVariables()
    for var in netVariables:
        dataLoader.AddVariable(var)

    addTrees(dataLoader)

    # dataLoader.PrepareTrainingAndTestTree(ROOT.TCut('numGoodJets >= 2'), 'nTrain_Signal=4000:nTrain_Background=4000:SplitMode=Random:!V')

    # factory.BookMethod(dataLoader, ROOT.TMVA.Types.kDL, 'kDL', '')
    factory.BookMethod(dataLoader, ROOT.TMVA.Types.kLD, 'kLD', '')
    # factory.BookMethod(dataLoader, ROOT.TMVA.Types.kPyTorch, 'PyTorch')
    
    factory.TrainAllMethods()
    factory.TestAllMethods()
    factory.EvaluateAllMethods()

if __name__ == '__main__':
    ROOT.gSystem.Setenv("OMP_NUM_THREADS", "6")
    runMva()

Maybe @moneta can help

A small update: I’ve found that if I draw the variables from the trees before adding them to the data loader, some entries have nonzero values, but most are still zero.

Hi,
The variables in the TTree can be stored as double; TMVA should be able to convert them internally to float.
If you are still having an issue here, please post the full running code, including the input file, so we can reproduce and investigate the issue.

Lorenzo
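
One possible cause, offered as an assumption rather than a confirmed diagnosis: in addTrees() each tree is detached with SetDirectory(0) and its file is closed right after the tree is handed to the DataLoader. TMVA reads the tree entries later, when the training and test sets are built, and a tree that was read from a file generally needs that file to stay open in order to load its baskets. That would also fit the observation that drawing a variable first leaves a few nonzero entries, since only the baskets already in memory remain readable. A minimal sketch of the alternative, keeping the files open until training has finished, is below. The names add_trees, samples, and open_file are hypothetical; open_file exists only so the pattern can be exercised without ROOT installed.

```python
def add_trees(data_loader, samples, open_file=None):
    """Add trees to a TMVA DataLoader and return the still-open files.

    samples is a list of (path, tree_name, weight, is_background) tuples
    (a hypothetical layout, not the JSON structure from the original post).
    """
    if open_file is None:
        # Imported lazily so the function can be tested without ROOT.
        import ROOT
        open_file = ROOT.TFile.Open

    open_files = []
    for path, tree_name, weight, is_background in samples:
        root_file = open_file(path)
        tree = root_file.Get(tree_name)

        if is_background:
            data_loader.AddBackgroundTree(tree, weight)
        else:
            data_loader.AddSignalTree(tree, weight)

        # No SetDirectory(0) and no Close() here: the file is kept alive
        # so the tree can still be read when TMVA builds its datasets.
        open_files.append(root_file)

    return open_files
```

In runMva() this would be called as open_files = add_trees(dataLoader, samples) before booking any method, with the files closed only after EvaluateAllMethods() has returned.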