Conceptual problem on regression

Dear experts,
I am trying to understand the workflow for calibration by regression, as distinct from parameter extraction from a classifier. In the latter case, once the MVA responses have been computed for both simulated and measured data, one can pass these responses to a parameter-extraction routine. Thus the manual reads: “The output of a classifier may then… enter a subsequent maximum-likelihood fit, or similar.”

For the former, the manual reads: “The output of a regression method could be directly used for example as energy estimate for a calorimeter cluster as a function of the cell energies.”

But when calibrating such a cluster, one still needs to find parameters in the final step, namely the gains of the cells.

To be completely concrete, for classification:

  1. pass the variables to TMVA to train, test and apply the MVA method of choice
  2. pass the responses from TMVA to a RooFit macro that:
  3. compares the response for the simulated data to that of the measured data, and extracts the parameters.

For calibration of a calorimeter by regression:

  1. pass the electromagnetic cell energies (whose weighted sum is the cluster energy) to TMVA to train, test and apply the MVA method.
  2. pass the response of the MVA to a macro that:
  3. ???

Many thanks, and I hope this is not too basic.

Also, I am aware that one could call TMVA within each run and use the output as the actual energy response. I wonder whether I can instead extract the array of parameters that gives that response, rather than incurring the overhead of calling TMVA for each data run. Or would this discard the energy dependence of the calibration?
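To make the "array of parameters" idea concrete, here is a minimal sketch in plain Python (not TMVA) of the one case where such an array clearly exists: a purely linear response, where the cluster energy is a fixed weighted sum of the cell energies. Under that assumption an ordinary least-squares fit returns one gain per cell. The names `fit_gains` and `solve_linear_system`, the three-cell layout and the toy numbers are all hypothetical. Note that a single array of gains is energy-independent by construction; a non-linear regressor (for example a BDT or a neural network) does not reduce to one.

```python
import random


def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_gains(cells, targets):
    """Least-squares gains g minimising sum_k (targets[k] - sum_i g[i]*cells[k][i])**2."""
    n = len(cells[0])
    # Normal equations: (A^T A) g = A^T t
    ata = [[sum(ev[i] * ev[j] for ev in cells) for j in range(n)] for i in range(n)]
    atb = [sum(ev[i] * t for ev, t in zip(cells, targets)) for i in range(n)]
    return solve_linear_system(ata, atb)


# Hypothetical toy "simulation": three cells with known gains plus small noise.
random.seed(1)
true_gains = [1.10, 0.95, 1.05]
cells = [[random.uniform(0.5, 10.0) for _ in true_gains] for _ in range(200)]
targets = [
    sum(g * e for g, e in zip(true_gains, ev)) + random.gauss(0.0, 0.1)
    for ev in cells
]

gains = fit_gains(cells, targets)


def calibrated_energy(event):
    """Apply the fitted gains to new cell energies; no regressor call is needed."""
    return sum(g * e for g, e in zip(gains, event))
```

Once `gains` is stored, `calibrated_energy` is just a weighted sum, so the per-run overhead disappears, but only because the model was assumed linear in the first place.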

Hi,

In the second case there is no step 3. Using an MVA method, you estimate the response function (e.g. the energy response) given some input variables.
See en.wikipedia.org/wiki/Regression_analysis
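
As a rough illustration of what "estimating the response function" can look like, the sketch below trains a very simple stand-in regressor on toy simulation and then applies it directly to a measured value, with no further fit. It is plain Python, not TMVA: the binned mean correction replaces the MVA method, and the names `train_binned_response` and `apply_response`, the bin edges and the toy non-linearity are all assumptions made for this example.

```python
import random


def find_bin(value, edges):
    """Index of the bin [edges[b], edges[b+1]) containing value, or None."""
    for b in range(len(edges) - 1):
        if edges[b] <= value < edges[b + 1]:
            return b
    return None


def train_binned_response(raw, true, edges):
    """Mean of true/raw in each raw-energy bin: a piecewise-constant correction."""
    sums = [0.0] * (len(edges) - 1)
    counts = [0] * (len(edges) - 1)
    for r, t in zip(raw, true):
        b = find_bin(r, edges)
        if b is not None:
            sums[b] += t / r
            counts[b] += 1
    # Empty bins fall back to a correction of 1.0 (no change).
    return [s / c if c else 1.0 for s, c in zip(sums, counts)]


def apply_response(r, edges, corrections):
    """Estimated true energy for a raw energy r, using the trained corrections."""
    b = find_bin(r, edges)
    if b is None:
        return r  # outside the trained range: leave uncorrected
    return r * corrections[b]


# Hypothetical toy simulation: the detector under-measures low energies more.
random.seed(2)
true_energy = [random.uniform(1.0, 50.0) for _ in range(500)]
raw_energy = [t * (0.90 + 0.002 * t) for t in true_energy]

edges = [0.0, 10.0, 20.0, 30.0, 40.0, 51.0]
corrections = train_binned_response(raw_energy, true_energy, edges)

# "Measured" cluster: the regression output is used directly as the estimate.
estimate = apply_response(23.75, edges, corrections)
```

Here the trained object is a function of the raw energy, not a single set of gains, so the energy dependence of the calibration is retained when it is applied event by event.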

Lorenzo