Dear experts, I am using ROOT 6.34.04 and I am loading a pickle file containing a trained XGBoost model.
I see that for XGBoost there is a from xgboost2tmva import convert_model
which should convert the model to XML, but I also see that ROOT itself has tutorials for this conversion step.
I was wondering which approach is recommended nowadays, and whether anyone has recent pointers on how to convert an
XGBoost model to TMVA, together with an example of using the TMVA reader to define variables in an RDataFrame operation.
Thanks in advance,
Renato
PS: In the meantime I will keep experimenting and report back if I find a working setup.
Hi @rquaglia,
Thank you for your question.
@moneta could you please take a look?
So, knowing the training variables in advance, this seems to work on ROOT 6.34:
import pickle
import xgboost as xgb
from xgboost import XGBClassifier
import ROOT
import numpy as np

# Load XGBoost model
with open('BDTS_block7.pickle', 'rb') as file:
    xgb_model = pickle.load(file)

# Training variables (the order must match the one used in training)
features_expected = ["B_BPVIP",
                     "B_BPVIPCHI2",
                     "B_END_VCHI2DOF",
                     "B_BPVDIRA",
                     "B_DOCA12",
                     "H_MINIP"]
if len(features_expected) != xgb_model.n_features_in_:
    raise ValueError("Invalid expected features to model features length")
feature_names = ['f' + str(i) for i in range(xgb_model.n_features_in_)]
print(feature_names)

# Convert the model to ROOT's RBDT format and load it back
ROOT.TMVA.Experimental.SaveXGBoost(xgb_model, "myModel", "output_model.root", num_inputs=len(feature_names))
bdt = ROOT.TMVA.Experimental.RBDT("myModel", "output_model.root")

df = ROOT.RDataFrame("DecayTree", "test.root")
# Define takes a column name and a single expression.
# Assumption: H_MINIP is the smaller of the two impact parameters.
node = df.Define("H_MINIP", "std::min(H1_BPVIP, H2_BPVIP)")

# Cast every input to float, keeping the training order
cols_input = []
for idx, c in enumerate(features_expected):
    print(idx, c)
    node = node.Define(f"BDTs_Input{idx}", f"(float){c}")
    cols_input.append(f"BDTs_Input{idx}")

# Evaluate the BDT per event; Compute returns a vector, so take element 0
node = node.Define("BDTv", ROOT.TMVA.Experimental.Compute[len(features_expected), float](bdt), cols_input).Define('BDTs', 'BDTv[0]')

canvas = ROOT.TCanvas()
h = node.Filter("BDTs>0.1").Histo1D("B_M")
h.Draw()
canvas.Draw()
canvas.SaveAs("BDTs.pdf")
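A side note on the input loop in the script: the model only knows its inputs as f0, f1, ..., so the position in features_expected is what ties each branch to a model input. A minimal sketch of that bookkeeping, with no ROOT needed, could look like the following. The helper name build_input_defines is hypothetical and only used for illustration.

```python
def build_input_defines(features):
    """Return (column_name, expression) pairs, one per feature, in training order.

    Hypothetical helper: it only builds the strings that would be passed
    to RDataFrame.Define, it does not touch ROOT itself.
    """
    defines = []
    for idx, name in enumerate(features):
        # Input idx of the model (f{idx}) is fed by the branch `name`, cast to float
        defines.append((f"BDTs_Input{idx}", f"(float){name}"))
    return defines


features_expected = ["B_BPVIP", "B_BPVIPCHI2", "B_END_VCHI2DOF",
                     "B_BPVDIRA", "B_DOCA12", "H_MINIP"]

defines = build_input_defines(features_expected)
cols_input = [column for column, _ in defines]

for column, expression in defines:
    print(column, "<-", expression)
```

With ROOT available, each pair would be applied as node = node.Define(column, expression), and cols_input would be passed to the Compute call unchanged. Reordering features_expected would silently change which branch feeds which model input.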
But when I run the full script, I see many messages like:
cling::DynamicLibraryManager::loadLibrary(): libmkl_intel_lp64.so.2: cannot open shared object file: No such file or directory
Error in <TInterpreter::TCling::AutoLoad>: failure loading library libTMVA.so for TMVA::Experimental
cling::DynamicLibraryManager::loadLibrary(): libmkl_intel_lp64.so.2: cannot open shared object file: No such file or directory
Error in <TInterpreter::TCling::AutoLoad>: failure loading library libTMVA.so for TMVA::Experimental
cling::DynamicLibraryManager::loadLibrary(): libmkl_intel_lp64.so.2: cannot open shared object file: No such file or directory
Error in <TInterpreter::TCling::AutoLoad>: failure loading library libTMVA.so for TMVA::Experimental::Internal
cling::DynamicLibraryManager::loadLibrary(): libmkl_intel_lp64.so.2: cannot open shared object file: No such file or directory
Error in <TInterpreter::TCling::AutoLoad>: failure loading library libTMVA.so for TMVA::Experimental
cling::DynamicLibraryManager::loadLibrary(): libmkl_intel_lp64.so.2: cannot open shared object file: No such file or directory
Error in <TInterpreter::TCling::AutoLoad>: failure loading library libTMVA.so for TMVA::Experimental::Internal
cling::DynamicLibraryManager::loadLibrary(): libmkl_intel_lp64.so.2: cannot open shared object file: No such file or directory
Error in <TInterpreter::TCling::AutoLoad>: failure loading library libTMVA.so for TMVA::Experimental
cling::DynamicLibraryManager::loadLibrary(): libmkl_intel_lp64.so.2: cannot open shared object file: No such file or directory
Error in <TInterpreter::TCling::AutoLoad>: failure loading library libTMVA.so for TMVA::Experimental::Internal
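The messages say that libTMVA.so fails to load because libmkl_intel_lp64.so.2 cannot be opened. A quick way to check whether the dynamic loader can find these libraries in the current environment is sketched below. It uses only the Python standard library; the helper name can_load is hypothetical, and the check says nothing about why a library is missing.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the library, or one of its dependencies, cannot be opened
        return False
    return True


# Library names taken from the error messages above
for name in ("libmkl_intel_lp64.so.2", "libTMVA.so"):
    status = "loadable" if can_load(name) else "NOT found by the loader"
    print(f"{name}: {status}")
```

If the MKL library is reported as not found, one possibility (an assumption about the setup, not verified here) is that this ROOT build was linked against Intel MKL and the MKL directory is missing from the library search path of the environment.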
An additional question: does any of the readers or converters provided by ROOT actually work with GBReweighter from hep_ml?
Is there any compatible way to do the conversion and loading within an RDataFrame operation for it (or can one convert it to an equivalent model that is then readable from TMVA::Experimental)?