Definition of vector in Python to call TMVA functions

Hello, I’m trying to use TMVA from my Python code, but I’m having trouble passing arguments to TMVA functions because I don’t know how the variables need to be defined. When I write dataloader.AddSignalTestEvent(tvars,event.weight), I do not know how to define tvars so that the line works. The following error keeps appearing:

TypeError: void TMVA::DataLoader::AddSignalTestEvent(const vector<double>& event, double weight = 1.) =>
could not convert argument 1

I have tried passing a list of floats, a numpy array and a string, but I don’t know what else I can try.

Thanks for the help!

Hi @Delegido

Unfortunately those conversions you tried (python list, numpy array) are not yet available for vectors. So you need to pass exactly what the function is expecting, which is an std::vector of doubles:

AddSignalTestEvent (const std::vector< Double_t > &event, Double_t weight=1.0)

You can create an std::vector from python like this:

v = ROOT.std.vector('double')()

and then you can use vector methods to add new elements.
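
As a minimal sketch of that, assuming your values are already in a Python list or numpy array: the helper names fill_vector and make_double_vector below are made up for illustration and are not part of ROOT. The only ROOT calls used are ROOT.std.vector('double')() and the vector's push_back method.

```python
def fill_vector(vec, values):
    """Append each value to vec as a Python float via its push_back method."""
    for x in values:
        vec.push_back(float(x))
    return vec


def make_double_vector(values):
    """Build a std::vector<double> from any Python iterable of numbers."""
    # ROOT is imported lazily so that fill_vector stays usable without it.
    import ROOT
    return fill_vector(ROOT.std.vector('double')(), values)
```

With this in place, the call from the question would look like dataloader.AddSignalTestEvent(make_double_vector(tvars), event.weight), where tvars is your list of floats.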

Hi @etejedor

Thank you for the answer. That was helpful, because the previous error no longer appears. Unfortunately, the code now crashes. The code I used is:

eventlist = get_event(event, sig_mass)
v = ROOT.std.vector('double')()
for i in range(len(eventlist)):
        v.push_back(eventlist[i])

gRandom = ROOT.TRandom3()
gRandom.SetSeed(event.eventNumber) 
n_entry = gRandom.Rndm()

if (n_entry%2):
        dataloader.AddSignalTestEvent(v,1.0)

The crash produces the following error:

SystemError: void TMVA::DataLoader::AddSignalTestEvent(const vector<double>& event, double weight = 1.) =>
    problem in C++; program state has been reset

Do you know what might be the problem? I have checked the contents of the vector and there is nothing unusual in it. It is only a vector of doubles.

Thank you for the help

Hi @Delegido

I think that issue is related to TMVA itself. If I run this in C++:

TMVA::DataLoader d("dataset");
std::vector<double> v;
v.push_back(1.);
v.push_back(2.);
d.AddSignalTrainingEvent(v, 1.0);

I get the same error.

Perhaps @kialbert, @swunsch, @moneta can comment?

Hi,

Try calling d.AddVariable("x"); first, so that memory is allocated for the event before you add it.

(This should give an appropriate error message instead of crashing, thanks for reporting. I will add a JIRA ticket.)
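
Until such an error message exists, a small guard on the Python side can catch the problem before the call reaches C++. This is only a rough sketch: it assumes you keep your own list of the names you passed to AddVariable, and that each event needs exactly one value per declared variable. check_event and variable_names are hypothetical names, not TMVA API.

```python
def check_event(values, variable_names):
    """Raise ValueError unless there is exactly one value per declared variable."""
    if not variable_names:
        raise ValueError(
            "no variables declared; call AddVariable before adding events"
        )
    if len(values) != len(variable_names):
        raise ValueError(
            "event has %d values but %d variables were declared (%s)"
            % (len(values), len(variable_names), ", ".join(variable_names))
        )
    return True
```

Calling check_event(values, variable_names) right before each Add...Event call turns the hard crash into an ordinary Python exception with a readable message.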

(Finally, in C++, please allocate the DataLoader on the heap with a plain new rather than on the stack or through managed memory, since TMVA, through the factory, implicitly takes ownership of it. I.e. use new DataLoader{})

Sorry for my brevity. Cheers,
Kim

Edit:

For reference, here is a working setup (albeit with a very simple dataset) in Python:

import ROOT

d = ROOT.TMVA.DataLoader("dataset")
d.AddVariable("x")
d.AddVariable("y")

for i in range(10):
    # vector has 2 entries, first entry corresponding to variable x,
    # second to variable y.
    v = ROOT.std.vector('double')()
    v.push_back(i*1.0)
    v.push_back(i*2.0)
    d.AddSignalTrainingEvent(v)
    d.AddSignalTestEvent(v)
    
    v2 = ROOT.std.vector('double')()
    v2.push_back(-i*1.0)
    v2.push_back(-i*2.0)
    d.AddBackgroundTrainingEvent(v2)
    d.AddBackgroundTestEvent(v2)

d.PrepareTrainingAndTestTree(ROOT.TCut(""), "")

Thank you!! Now it is working