[ROOT::Fitter] how to use Huber loss function for linear fitting

Hi,

I’m using the ROOT::Fit::Fitter class to do a linear regression. Here is a minimal example (using PyROOT):

import ROOT
import numpy as np

fitter = ROOT.Fit.Fitter()
func = ROOT.TF1("f1", "pol1", 0, 130)
fitter.SetFunction(ROOT.Math.WrappedMultiTF1(func, func.GetNdim()), True)
fitter.Config().SetMinimizer("Linear")

x = np.array([67.5, 67.5, 47.5, 47.5, 57.5])
x_err = np.array([2.5, 2.5, 2.5, 2.5, 2.5])
y = np.array([107.5,  102.5, -127.5,  117.5,  112.5])
y_err = np.array([2.5, 2.5, 2.5, 2.5, 2.5])
bin_data = ROOT.Fit.BinData(len(x), x, y, x_err, y_err)

fitter.Config().ParSettings(0).SetValue(0.)
fitter.Config().ParSettings(1).SetValue(1.)
is_ok = fitter.Fit(bin_data)
print(f"is ok: {is_ok}")
if is_ok:
    res = fitter.Result()
    print(f"slope: {res.Parameter(1)}")
    print(f"offset: {res.Parameter(0)}")
    print(f"pvalue: {res.Prob()}")

The printout is:

is ok: True
slope: 5.500000000000007
offset: -253.7500000000004
pvalue: 1.4567546900600512e-36

However, our datasets usually contain some outliers (including the one in the example above, where the point at y = -127.5 is far from the others), and an ordinary least-squares fit is pulled strongly towards them. With the scikit-learn library this can be resolved by using Huber regression, which limits the influence of the outliers. Here is an example:

import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

x = np.array([67.5, 67.5, 47.5, 47.5, 57.5])
x_err = np.array([2.5, 2.5, 2.5, 2.5, 2.5])
y = np.array([107.5,  102.5, -127.5,  117.5,  112.5])
y_err = np.array([2.5, 2.5, 2.5, 2.5, 2.5])

huber_res = HuberRegressor().fit(np.reshape(x, (-1, 1)), y)
print(f"coef: {huber_res.coef_}, offset: {huber_res.intercept_}")

The printout is:

coef: [-0.46186719], offset: 136.75230379098608
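
To make clear what I mean by the Huber loss, here is a minimal sketch of the textbook definition: quadratic for small residuals and linear for large ones. The function name `huber_loss` and the parameter name `delta` are my own choices for illustration. The default of 1.35 mirrors scikit-learn's default `epsilon`, but as far as I understand HuberRegressor also fits a scale parameter and applies the threshold to the scaled residuals, so this is not a copy of what scikit-learn does internally.

```python
def huber_loss(residual, delta=1.35):
    """Textbook Huber loss (illustrative helper, not a ROOT or scikit-learn API).

    Quadratic for |residual| <= delta, linear beyond that, so a single
    large residual contributes far less than it would with a squared loss.
    """
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual ** 2
    return delta * (a - 0.5 * delta)


def squared_loss(residual):
    """Ordinary least-squares loss, for comparison."""
    return 0.5 * residual ** 2


if __name__ == "__main__":
    # Compare how both losses grow with the size of the residual.
    for r in (0.5, 1.35, 10.0, 100.0):
        print(f"r = {r}: squared = {squared_loss(r)}, huber = {huber_loss(r)}")
```

The linear growth for large residuals is what keeps one outlier from dominating the fit.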

Here is a comparison between the two methods (the red line uses the Huber loss function and the blue line uses ROOT’s standard linear fit).

The Huber loss function yields a much better result: the fitted line follows the four consistent points instead of being dragged towards the outlier.

So I would like to know whether Huber regression can be done with ROOT, since we need to do this in C++ in our project.
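
In case it helps the discussion, one workaround I am considering is iteratively reweighted least squares (IRLS): repeat an ordinary weighted linear fit, and after each pass down-weight the points whose scaled residual exceeds the Huber threshold. Below is a minimal pure-Python sketch of that idea for a straight line. The helper names `weighted_line_fit` and `huber_line_fit` are hypothetical, the scale is fixed to the y errors (an assumption; scikit-learn estimates it instead), and this is not an existing ROOT feature as far as I know.

```python
def weighted_line_fit(x, y, w):
    """Closed-form weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx * sx
    b = (sw * sxy - sx * sy) / det
    a = (sy - b * sx) / sw
    return a, b


def huber_line_fit(x, y, sigma, delta=1.35, n_iter=100, tol=1e-10):
    """Huber-type straight-line fit via IRLS (illustrative sketch).

    Assumption: the y errors `sigma` are used as a fixed scale, so a point
    is treated as an outlier when |residual| / sigma > delta.
    """
    # Start from the ordinary chi-square fit (weights 1 / sigma^2).
    w = [1.0 / s ** 2 for s in sigma]
    a, b = weighted_line_fit(x, y, w)
    for _ in range(n_iter):
        w = []
        for xi, yi, si in zip(x, y, sigma):
            z = abs(yi - (a + b * xi)) / si
            # Huber weight: 1 inside the threshold, delta / z outside.
            hw = 1.0 if z <= delta else delta / z
            w.append(hw / si ** 2)
        a_new, b_new = weighted_line_fit(x, y, w)
        converged = abs(a_new - a) < tol and abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b


if __name__ == "__main__":
    x = [67.5, 67.5, 47.5, 47.5, 57.5]
    y = [107.5, 102.5, -127.5, 117.5, 112.5]
    y_err = [2.5] * 5
    offset, slope = huber_line_fit(x, y, y_err)
    print(f"slope: {slope}, offset: {offset}")
```

On the dataset above this should settle at a slope of about -0.48, in the same ballpark as the HuberRegressor result, though the two need not agree exactly because of the fixed scale. If this approach is reasonable, I assume the inner weighted fit could be done in C++ with the "Linear" minimizer by inflating each y error to y_err / sqrt(weight) before every pass, but I would prefer a built-in option if one exists.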

Thanks for your attention