BDT method with XGBoost and TMVA

I am trying to use XGBoost to perform a BDT analysis of signal and background for a set of final states. My code works well for some sets of sample events, but in other cases the notebook stops with the message ‘The kernel appears to have died. It will restart automatically.’ This happens even though the dataset is small (less than 20 MB). If I use the same dataset in a TMVA BDT analysis, I get no error. Is there a problem with my dataset in particular? Could you help me understand why this happens despite the small file size and the error-free run in TMVA?

I attach herewith the data files used. Signal files are:
cutflow_signunuh.root (830.1 KB): total signal sample
sigTrainData.root (420.5 KB): signal training sample
sigTestData.root (417.0 KB): signal testing sample

Background samples are:
cutflow_bkgnunuh.root (788.6 KB): total background sample
bkgTrainData.root (399.5 KB): background training sample
bkgTestData.root (397.1 KB): background testing sample

Any help would be highly appreciated.

Thanks and regards,
Antara

Hi,
This may be more of a problem in XGBoost, but I can have a look at it. I would also need the code you use to train the model and that produces the error you have reported.
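
If the notebook only reports that the kernel died, Python's standard faulthandler module might help to capture where the crash happens. The lines below are a minimal sketch, not a tested recipe for this notebook: the log file name xgb_crash.log is a hypothetical choice, and the assumption is that this runs in the first notebook cell, before xgboost is imported.

```python
import faulthandler

# Hypothetical log file name. The handle must stay open for the whole session,
# because faulthandler writes to it only at the moment of a hard crash.
crash_log = open("xgb_crash.log", "w")

# On a fatal signal such as a segmentation fault, dump the Python traceback
# of all threads into the log file instead of losing it with the kernel.
faulthandler.enable(file=crash_log, all_threads=True)
```

After the kernel restarts, the content of xgb_crash.log (if anything was written) could be posted together with the notebook.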

Lorenzo

Thanks for the reply.

I attach herewith the code I am using. I am using XGBoost v1.2. Please have a look and kindly help me get it working for the analysis.
tmva_ee2vvh_train_sigbkg.cpp (11.5 KB)
Untitled33.ipynb (6.0 KB)

Thanks and regards,
Antara

Hi,

I wanted to ask whether you have checked the code and come across a similar error. I am unsure whether it is caused by the dataset or by conflicting dependencies for XGBoost. The code runs fine with some datasets but not with several others. I tried uninstalling the packages required for the BDT analysis and downgrading their versions, but the same error, ‘The kernel appears to have died.’, still appears. Could you please tell me what to do?
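
To help rule out a dependency conflict, the installed versions can be listed with Python's standard importlib.metadata module. This is only a minimal sketch: the helper name installed_versions is hypothetical, and the package list is an assumption about what the notebook imports, so it may need adjusting.

```python
import sys
from importlib import metadata


def installed_versions(packages):
    """Return {package name: installed version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed package list; replace it with whatever the notebook actually imports.
packages = ["xgboost", "numpy", "pandas", "scikit-learn", "uproot"]

print("python:", sys.version.split()[0])
for name, version in installed_versions(packages).items():
    print(f"{name}: {version or 'not installed'}")
```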

Thanks
Antara

Hi,

I think your notebook has some issues that trigger an error in XGBoost, which causes the kernel to die. I have found one error: the labels y you pass are not the 7th column of your data, but the 14th. But after fixing this I still get an error when calling model.predict_proba(X_test). I am not an expert in XGBoost and I don’t know what the problem is.
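
A quick check of the label column before training might help to catch this kind of mistake. The sketch below uses plain Python on toy rows, not your actual data. The helper names are hypothetical, and it assumes binary labels coded as 0 and 1. Note that the index is 0-based, so the 14th column would be index 13.

```python
import math


def summarize_column(rows, index):
    """Return the sorted distinct finite values and the count of non-finite entries."""
    values = [float(row[index]) for row in rows]
    non_finite = sum(1 for v in values if not math.isfinite(v))
    distinct = sorted({v for v in values if math.isfinite(v)})
    return distinct, non_finite


def looks_like_binary_labels(rows, index):
    """True if the column holds only 0/1 values and no NaN or infinity."""
    distinct, non_finite = summarize_column(rows, index)
    return non_finite == 0 and set(distinct) <= {0.0, 1.0}


# Toy rows (two features followed by a label), purely for illustration.
rows = [
    [0.5, 120.3, 1],
    [1.7, float("nan"), 0],
    [2.1, 98.6, 1],
]

print(summarize_column(rows, 1))          # feature column containing one NaN
print(looks_like_binary_labels(rows, 2))  # the intended label column
print(looks_like_binary_labels(rows, 0))  # a feature column picked by mistake
```

If the column passed as y does not look like binary labels, or a feature column reports non-finite entries, that could be worth ruling out before calling fit or predict_proba.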
I would suggest using TMVA. Is there any special reason you need to use XGBoost instead of TMVA?

Lorenzo