A root file produced by RooClassFactory::makePdf() Crashed

Hi, Rooters,

When I run this script CountingModel.C which copied from this website, twiki.cern.ch/twiki/bin/view/Ro … del_and_fi .
The produced root file, CoutringModel.root, can be opened without any problem.

Because the model I’m trying to apply can’t be expressed by the functions Roofit provided.
So, I have to build a model with a C++ script.
With the help of the script of “rf104_classfactory.C”, after some struggling,
my scripts worked finally : produced a root file.
However, the problems of the root file are
1). it can’t be used to do a hyper test by applying “SimpleHypoTestInv.C”.
2). it can’t be opened normally as the "CoutringModel.root ".
It crashed, and the error message is something like :

~~~~~~~~~~~~~~~~~~ Beginning of error message ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Warning in TClass::TClass: no dictionary for class BkgLinearPdf is available
Warning in TClass::TClass: no dictionary for class OOnePdf is available
root [1] TBrowser a
root [2] Error in TBufferFile::ReadObject: trying to read an emulated class (BkgLinearPdf) to store in a compiled pointer (TObject)

*** Break *** segmentation violation

===========================================================
There was a crash.
This is the entire stack trace of all threads:

~~~~~~~~~~~~~~~~~~ End of error message ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

After checking around here, I found similar cases while haven’t been really solved.
[Proof crashing with autoImportClassCode
[root.cern.ch/phpBB3/viewtopic.p … Pdf#p74299](Compiling and using a custom RooFit class

Any expert in this community, could you please take a close look ?
Thanks !

For your reference, I attached my script here : it produce root file normally, but the root file can’t be opened normally.

#include "TTree.h"

using namespace RooFit;
using namespace RooStats;

void CountingModel(  int nsig = 1,         // number of signal events .
                     int nbkg = 169)       // number of background events.
{
   RooWorkspace w("w");

   ///~~~~~~~~~~~~~~~~~~  Begining of creating a model p.d.f. ~~~~~~~~~~~~~~///

  RooClassFactory::makePdf("BkgLinearPdf","x,A,B") ;
#ifdef __CINT__
  gROOT->ProcessLineSync(".x BkgLinearPdf.cxx+") ;
#endif


  RooClassFactory::makePdf("OOnePdf","x,cof") ;
#ifdef __CINT__
  gROOT->ProcessLineSync(".x OOnePdf.cxx+") ;
#endif


   w.factory("BkgLinearPdf:bkg_pdf(x[0,10], a[2.5,1.0,30.0], b[-0.0125])");/// a,interceipt; b, slope.

   ///w.factory("OOnePdf:sig_pdf(x, mass[2], sigma[0.3])");
   w.factory("OOnePdf:sig_pdf(x[0,10],cof[1.0,0.01,100.0])");

   // Define the fraction of signal .
   /// RooRealVar fsig("fsig","signal fraction",0.001,0.,1.) ;
   ///w.factory("SUM:model(fsig[0,1.0] * sig_pdf, (1 - fsig[0,1.0]) * bkg_pdf)");  // for extended model

   w.factory("SUM:model(nsig[0,170]*sig_pdf, nbkg[0,170]*bkg_pdf)");  // for extended model
                            ///[0,1000] is the range of nsig and nbkg. their values assigned above.

   ///~~~~~~~~~~~~~~~~~~  End of creating a model p.d.f. ~~~~~~~~~~~~~~///

   
   ///~~~~~~~  Begining of creating the ModelConfig object and set "b0" as a global observable ~~~///

  /// w.var("a")->setVal(a); /// this line shouldn't be necessary for our usage. 
   w.var("a")->setConstant(true); // needed for being treated as global observables
   ///w.var("sigmab")->setVal(sigmab*b);  
   
   ModelConfig mc("ModelConfig",&w);/// declare the name and reference of ModelConfig as "mc" and "ModelConfig".
   mc.SetPdf(*w.pdf("model")); /// The pdf of mc is the "model(pdf, constraint)" from w factory.
   mc.SetParametersOfInterest(*w.var("cof")); /// For ModelConfig, the POI is the "s" in w factory.
   mc.SetObservables(*w.var("x"));  /// For ModelConfig, the observables is the "nobs" in w factory.
   ///mc.SetNuisanceParameters(*w.var("a"),"nbkg" );/// For ModelConfig, the NP is the "b" in w factory.

   w.defineSet("nuisParams","a,nbkg");
   mc.SetNuisanceParameters(*w.set("nuisParams") );/// For ModelConfig, the NP is the "b" in w factory.

   // these are needed for the hypothesis tests
   mc.SetSnapshot(*w.var("cof"));
   // need now to set the global observable
   mc.SetGlobalObservables(*w.var("a0"));

   mc.Print();
   // import model in the workspace 
   w.import(mc);

   
   ///~~~~~~~  End of creating the ModelConfig object and set "b0" as a global observable ~~~///

  // import a tree
  TFile *treefile = TFile::Open("measuredEnergyTree.root");
  TTree* tree = (TTree*) treefile->Get("ntuple");
   
  // Define observables x
  RooRealVar x("x","x",1,0,11) ;

  // Import only observables (x)
  RooDataSet ds("ds","ds",RooArgSet(x),Import(*tree)) ;
  ds.Print() ;
  ds.add( RooArgSet(x));


  RooPlot* frame3 = x.frame(Title("import data from a tree")) ;
  ds.plotOn(frame3) ;

  TCanvas* c = new TCanvas("import data from a tree","import data from a tree ",800,800) ;
   gPad->SetLeftMargin(0.15) ; frame3->GetYaxis()->SetTitleOffset(1.4) ; frame3->Draw() ;


   w.import(ds);

   w.Print();

   TString fileName = "CountingModel.root"; 

   // write workspace in the file (recreate file if already existing)
   w.writeToFile(fileName, true);

}

Thanks !

Best,
Junhui

Hi,

In order to read the ROOT file you need to generate the dictionary for those classes. For example, if you have tour class BkgLinearPdf implemented in the BkgLinearPdf.cxx file,
It is enough that you do before running the RooStats macro (i.e. before opening the file)
root[] .L BkgLinearPdf.cxx+

Best Regards

Lorenzo

Hi, Lorentz,

Thanks for reply !

I’ve tried your suggestion.
It worked, :smiley: .

However, another even more mysterious problem happened, :frowning:
There is no any error message, just quitted and no plot have been produced.
It’s shown in [2] for a frequentist calculator; in [3] for an asymptotic calculator.
I guess the whole script hasn’t been finished yet because the last line contains “ToyMCSampler” which probably means it quitted after this line.

Do you have any comments ?

Thanks !

Best,
Junhui

[2]
root [0] .L BkgLinearPdf.cxx+

RooFit v3.56 – Developed by Wouter Verkerke and David Kirkby
Copyright © 2000-2013 NIKHEF, University of California & Stanford University
All rights reserved, please read roofit.sourceforge.net/license.txt

root [1] .L OOnePdf.cxx+
root [2] .L SimpleHypoTestInv.C
root [3] SimpleHypoTestInv(“CountingModel.root”,“w”,“ModelConfig”,“ds”)
[#1] INFO:InputArguments – HypoTestInverter ---- Input models:
using as S+B (null) model : ModelConfig
using as B (alternate) model : ModelConfig_with_poi_0

Doing a fixed scan in interval : 0 , 170
[#1] INFO:Eval – HypoTestInverter::GetInterval - run a fixed scan
[#1] INFO:ObjectHandling – RooWorkspace::saveSnaphot(w) replacing previous snapshot with name ModelConfig__snapshot

=== Using the following for ModelConfig ===
Observables: RooArgSet:: = (x)
Parameters of Interest: RooArgSet:: = (nsig)
Nuisance Parameters: RooArgSet:: = (a,cof,nbkg)
Global Observables: RooArgSet:: = (a)
PDF: RooAddPdf::model[ nsig * sig_pdf + nbkg * bkg_pdf ] = 2.4375
Snapshot:

  1. 0x30d6c40 RooRealVar:: nsig = 0 L(0 - 170) “nsig”

=== Using the following for ModelConfig_with_poi_0 ===
Observables: RooArgSet:: = (x)
Parameters of Interest: RooArgSet:: = (nsig)
Nuisance Parameters: RooArgSet:: = (a,cof,nbkg)
Global Observables: RooArgSet:: = (a)
PDF: RooAddPdf::model[ nsig * sig_pdf + nbkg * bkg_pdf ] = 2.4375
Snapshot:

  1. 0x30dc640 RooRealVar:: nsig = 0 L(0 - 170) “nsig”

[#0] PROGRESS:Generation – Test Statistic on data: 0
[#1] INFO:InputArguments – Profiling conditional MLEs for Null.
[#1] INFO:InputArguments – Using a ToyMCSampler. Now configuring for Null.

[3]
root [0] .L BkgLinearPdf.cxx+

RooFit v3.56 – Developed by Wouter Verkerke and David Kirkby
Copyright © 2000-2013 NIKHEF, University of California & Stanford University
All rights reserved, please read roofit.sourceforge.net/license.txt

root [1] .L OOnePdf.cxx+
root [2] .L SimpleHypoTestInv.C
root [3] SimpleHypoTestInv(“CountingModel.root”,“w”,“ModelConfig”,“ds”)
[#0] PROGRESS:Eval – AsymptoticCalculator: Find best unconditional NLL on observed data
AsymptoticCalculator::EvaluateNLL … using Minuit / Migrad with strategy 1 and tolerance 1


** 1 **SET PRINT 0



** 2 **SET NOGRAD


PARAMETER DEFINITIONS:
NO. NAME VALUE STEP SIZE LIMITS
1 cof 1.00000e+00 4.95000e-01 1.00000e-02 1.00000e+02
2 nbkg 1.69000e+02 5.00000e-01 0.00000e+00 1.70000e+02
3 nsig 0.00000e+00 1.70000e+01 0.00000e+00 1.70000e+02
MINUIT WARNING IN PARAM DEF
============== STARTING VALUE IS AT LIMIT.
MINUIT WARNING IN PARAMETR
============== VARIABLE3 IS AT ITS LOWER ALLOWED LIMIT.
MINUIT WARNING IN PARAMETR
============== VARIABLE3 BROUGHT BACK INSIDE LIMITS.


** 3 **SET ERR 0.5



** 4 **SET PRINT 0



** 5 **SET STR 1



** 6 **MIGRAD 1500 1


MINUIT WARNING IN MIGrad
============== VARIABLE3 IS AT ITS LOWER ALLOWED LIMIT.
MIGRAD MINIMIZATION HAS CONVERGED.
MIGRAD WILL VERIFY CONVERGENCE AND ERROR MATRIX.
MINUIT WARNING IN HESSE
============== Negative diagonal element 1 in Error Matrix
MINUIT WARNING IN HESSE
============== 1 added to diagonal of error matrix
FCN=-314.635 FROM MIGRAD STATUS=CONVERGED 186 CALLS 187 TOTAL
EDM=2.56192e-05 STRATEGY= 1 ERROR MATRIX ACCURATE
EXT PARAMETER STEP FIRST
NO. NAME VALUE ERROR SIZE DERIVATIVE
1 cof 1.00000e+00 2.25275e+01 5.00000e-01 -8.52651e-13
2 nbkg 1.69250e+02 2.37146e+01 1.40402e-02 2.42609e-03
3 nsig 1.71520e+00 4.25928e+00 3.23023e-03 1.71304e-02
ERR DEF= 0.5
AsymptoticCalculator::EvaluateNLL - value = -314.635 fit time : Real time 0:11:15, CP time 666.850
[#0] PROGRESS:Eval – Best fitted POI value = 1.7152 +/- 4.25928
[#0] PROGRESS:Eval – AsymptoticCalculator: Building Asimov data Set
[#1] INFO:InputArguments – AsymptoticCalculator: Asimov data will be generated using fitted nuisance parameter values
MakeAsimov: Setting poi nsig to a constant value = 0
MakeAsimov: doing a conditional fit for finding best nuisance values

Hi,

I don’t understand… Does the macro just finish after printing ?

Are you then afterwards back in the ROOT prompt or does ROOT exit as well ? In that case it is due to some crash happening somewhere…

You can try increasing the print level, by calling before running your macro, for example

ROOT::Math::Minimizer::SetDefaultPrintLevel(1);
RooStats::AsymptoticCalculator::SetPrintLevel(1);  

Otherwise I would need your code and your workspace to understand the problem
Lorenzo

Hi, Lorenzo,

Thanks for your reply and suggestion !

I’ve followed your suggestion by adding the two lines, while it doesn’t print much more than before, as attached here [4].

Besides, I have mentioned that the script quitted without any errors, which means the script quitted from the ROOT prompt and backed to the Linux one.

To avoid too busy at this reply, I’ll attach my scripts in the next reply for your diagnosis.

Thanks in advance !

Best,
Junhui

[4]
root [0] .L BkgLinearPdf_cxx.so

RooFit v3.56 – Developed by Wouter Verkerke and David Kirkby
Copyright © 2000-2013 NIKHEF, University of California & Stanford University
All rights reserved, please read roofit.sourceforge.net/license.txt

root [1] .L OOnePdf_cxx.so
root [2] .L SimpleHypoTestInv.C
root [3] SimpleHypoTestInv(“CountingModel.root”,“w”,“ModelConfig”,“ds”)
[#1] INFO:InputArguments – HypoTestInverter ---- Input models:
using as S+B (null) model : ModelConfig
using as B (alternate) model : ModelConfig_with_poi_0

poimin = 0, poimax = 170
Doing a fixed scan in interval : 0 , 170

test flag 6
[#1] INFO:Eval – HypoTestInverter::GetInterval - run a fixed scan
[#1] INFO:ObjectHandling – RooWorkspace::saveSnaphot(w) replacing previous snapshot with name ModelConfig__snapshot

=== Using the following for ModelConfig ===
Observables: RooArgSet:: = (x)
Parameters of Interest: RooArgSet:: = (nsig)
Nuisance Parameters: RooArgSet:: = (a,cof,nbkg)
Global Observables: RooArgSet:: = (a)
PDF: RooAddPdf::model[ nsig * sig_pdf + nbkg * bkg_pdf ] = 2.4375
Snapshot:

  1. 0x264ee80 RooRealVar:: nsig = 0 L(0 - 170) “nsig”

=== Using the following for ModelConfig_with_poi_0 ===
Observables: RooArgSet:: = (x)
Parameters of Interest: RooArgSet:: = (nsig)
Nuisance Parameters: RooArgSet:: = (a,cof,nbkg)
Global Observables: RooArgSet:: = (a)
PDF: RooAddPdf::model[ nsig * sig_pdf + nbkg * bkg_pdf ] = 2.4375
Snapshot:

  1. 0x2654880 RooRealVar:: nsig = 0 L(0 - 170) “nsig”

EvaluateProfileLikelihood - mu hat = 1.66212, uncond ML = -314.635, cond ML = -314.635 pll = 0 time (create/fit1/2) 2.35 , 645.73 , 0
[#0] PROGRESS:Generation – Test Statistic on data: 0
[#1] INFO:InputArguments – Profiling conditional MLEs for Null.
[#1] INFO:InputArguments – Using a ToyMCSampler. Now configuring for Null.

Hi, Lorenzo,

This is the script of “CountingModel.C” to produce a workspace [5].

[5]



#include "TTree.h"

using namespace RooFit;
using namespace RooStats;

void CountingModel(  int nsig = 0,         // number of signal events .
                     int nbkg = 170)       // number of background events.
{
   RooWorkspace w("w");

   ///~~~~~~~~~~~~~~~~~~  Begining of creating a model p.d.f. ~~~~~~~~~~~~~~///

  RooClassFactory::makePdf("BkgLinearPdf","x,A,B") ;
#ifdef __CINT__
  gROOT->ProcessLineSync(".x BkgLinearPdf.cxx+") ;
#endif


  RooClassFactory::makePdf("OOnePdf","x,cof") ;
#ifdef __CINT__
  gROOT->ProcessLineSync(".x OOnePdf.cxx+") ;
#endif

   w.factory("BkgLinearPdf:bkg_pdf(x[0,10], a[2.5,1.0,30.0], b[-0.0125])");/// a,interceipt; b, slope.

   w.factory("OOnePdf:sig_pdf(x[0,10],cof[1.0,0.01,100.0])");

   w.factory("SUM:model(nsig[0,170]*sig_pdf, nbkg[0,170]*bkg_pdf)");  // for extended model

   ///~~~~~~~~~~~~~~~~~~  End of creating a model p.d.f. ~~~~~~~~~~~~~~///

   
   ///~~~~~~~  Begining of creating the ModelConfig object and set "b0" as a global observable ~~~///

   w.var("nsig")->setVal(nsig); 
   w.var("nbkg")->setVal(nbkg); 
   w.var("a")->setConstant(true); // needed for being treated as global observables
   
   ModelConfig mc("ModelConfig",&w);/// declare the name and reference of ModelConfig as "mc" and "ModelConfig".
   mc.SetPdf(*w.pdf("model")); 
   mc.SetParametersOfInterest(*w.var("nsig")); 
   mc.SetObservables(*w.var("x"));  

   w.defineSet("nuisParams","a,cof,nbkg");
   mc.SetNuisanceParameters(*w.set("nuisParams") );

   // these are needed for the hypothesis tests
   mc.SetSnapshot(*w.var("nsig"));
   // need now to set the global observable
   mc.SetGlobalObservables(*w.var("a"));

   mc.Print();
   // import model in the workspace 
   w.import(mc);

   
   ///~~~~~~~  End of creating the ModelConfig object and set "b0" as a global observable ~~~///

  // import a tree
  TFile *treefile = TFile::Open("measuredEnergyTree.root");
  TTree* tree = (TTree*) treefile->Get("ntuple");
   
  // Define observables x
  RooRealVar x("x","x",1,0,11) ;

  // Import only observables (x)
  RooDataSet ds("ds","ds",*w.var("x"),Import(*tree)) ; ///refered from rf102_dataimport.C
  ///RooDataSet ds("ds","ds",*w.var("x"),Import(*tree)) ;
  ds.Print() ;
  ds.add( *w.var("x"));
  ///ds.add( RooArgSet(x));


  RooPlot* frame3 = x.frame(Title("import data from a tree")) ;
  ds.plotOn(frame3) ;

  TCanvas* c = new TCanvas("import data from a tree","import data from a tree ",800,800) ;
   gPad->SetLeftMargin(0.15) ; frame3->GetYaxis()->SetTitleOffset(1.4) ; frame3->Draw() ;


   w.import(ds);

   w.Print();

   TString fileName = "CountingModel.root"; 

   // write workspace in the file (recreate file if already existing)
   w.writeToFile(fileName, true);

}

This is the code of “SimleHypoTestInv.C”[6].
I’ve put some “test flag” there to know where the script has been stuck.
It turns out my script is stuck at the line of
" HypoTestInverterResult * r = calc.GetInterval(); "
Because I’ve seen “test flag 6”, but didn’t see “test flag 7”.
You can see the feature from my previous reply.

[6]


using namespace RooStats;
using namespace RooFit;
#include "TMath.h"

void SimpleHypoTestInv( const char* infile =  "CountingModel.root", 
                        const char* workspaceName = "w",
                        const char* modelConfigName = "ModelConfig",
                        const char* dataName = "ds" )
{

   ROOT::Math::MinimizerOptions::SetDefaultPrintLevel(1);
   RooStats::AsymptoticCalculator::SetPrintLevel(1);  

  /////////////////////////////////////////////////////////////
  // First part is just to access the workspace file 
  ////////////////////////////////////////////////////////////

  // open input file 
  TFile *file = TFile::Open(infile);
  if (!file) return;

  // get the workspace out of the file
  RooWorkspace* w = (RooWorkspace*) file->Get(workspaceName);


  // get the modelConfig out of the file
  RooStats::ModelConfig* mc = (RooStats::ModelConfig*) w->obj(modelConfigName);

  // get the data out of the file
  RooAbsData* data = w->data(dataName);

  ModelConfig*  sbModel = (RooStats::ModelConfig*) w->obj(modelConfigName);
  RooRealVar* poi = (RooRealVar*) sbModel->GetParametersOfInterest()->first();
  /// background model created from s+b model by setting nsig = 0.
  ModelConfig * bModel = (ModelConfig*) sbModel->Clone();
  bModel->SetName(TString(sbModel->GetName())+TString("_with_poi_0"));      
  poi->setVal(0);
  bModel->SetSnapshot( *poi  );

  FrequentistCalculator  fc(*data, *bModel, *sbModel);
  ///fc.SetToys(3000,1500);  
  fc.SetToys(1000,500);  

  // asymptotic calculator
  /// AsymptoticCalculator  ac(*data, *bModel, *sbModel);
  ///  ac.SetOneSided(true);  // for one-side tests (limits)
  //  ac->SetQTilde(true);
  /// AsymptoticCalculator::SetPrintLevel(-1);


  // create hypotest inverter 
  // passing the desired calculator 
  ///HypoTestInverter calc(ac);    // for asymptotic 
  HypoTestInverter calc(fc);  // for frequentist

  // set confidence level (e.g. 95% upper limits)
  calc.SetConfidenceLevel(0.95);
  
  // for CLS
  bool useCLs = true;
  calc.UseCLs(useCLs);
  calc.SetVerbose(false);

  // configure ToyMC Samler (needed only for frequentit calculator)
  ToyMCSampler *toymcs = (ToyMCSampler*)calc.GetHypoTestCalculator()->GetTestStatSampler();
  
  // profile likelihood test statistics 
  ProfileLikelihoodTestStat profll(*sbModel->GetPdf());
  // for CLs (bounded intervals) use one-sided profile likelihood
  if (useCLs) profll.SetOneSided(true);

  // ratio of profile likelihood - need to pass snapshot for the alt 
  // RatioOfProfiledLikelihoodsTestStat ropl(*sbModel->GetPdf(), *bModel->GetPdf(), bModel->GetSnapshot());
   
  // set the test statistic to use 
  toymcs->SetTestStatistic(&profll);

  // if the pdf is not extended (e.g. in the Poisson model) 
  // we need to set the number of events
  if (!sbModel->GetPdf()->canBeExtended())
     toymcs->SetNEventsPerToy(1);


  int npoints = 170;  // number of points to scan
  // min and max (better to choose smaller intervals)
  double poimin = poi->getMin();
  double poimax = poi->getMax();
  ///poimin = 0; poimax=10;
  std::cout << endl << "poimin = " << poimin << ", poimax = " << poimax <<  endl;

  std::cout << "Doing a fixed scan  in interval : " << poimin << " , " << poimax << std::endl;
  calc.SetFixedScan(npoints,poimin,poimax);
  
  std::cout << endl << "test flag 6" << endl;
  // run inberter, only one line, :-).
  HypoTestInverterResult * r = calc.GetInterval();

  std::cout << endl << "test flag 7" << endl;
  double upperLimit = r->UpperLimit();

  std::cout << endl << "test flag 8" << endl;
  std::cout << "The computed upper limit is: " << upperLimit << std::endl;
  
  // compute expected limit
  std::cout << "Expected upper limits, using the B (alternate) model : " << std::endl;
  std::cout << " expected limit (median) " << r->GetExpectedUpperLimit(0) << std::endl;
  std::cout << " expected limit (-1 sig) " << r->GetExpectedUpperLimit(-1) << std::endl;
  std::cout << " expected limit (+1 sig) " << r->GetExpectedUpperLimit(1) << std::endl;
  

  // plot now the result of the scan 

  HypoTestInverterPlot *plot = new HypoTestInverterPlot("HTI_Result_Plot","HypoTest Scan Result",r);

  // plot in a new canvas with style
  TCanvas * c1 = new TCanvas("HypoTestInverter Scan"); 
  c1->SetLogy(false);

  plot->Draw("CLb 2CL");  // plot also CLb and CLs+b 
  //plot->Draw("OBS");  *// plot only observed p-value


  // plot also in a new canvas the test statistics distributions 
  
  // plot test statistics distributions for the two hypothesis
  // when distribution is generated (case of FrequentistCalculators)

/*
  const int n = r->ArraySize();
  if (n> 0 &&  r->GetResult(0)->GetNullDistribution() ) { 
     TCanvas * c2 = new TCanvas("Test Statistic Distributions","",2);
     if (n > 1) {
        int ny = TMath::CeilNint( sqrt(n) );
        int nx = TMath::CeilNint(double(n)/ny);
        c2->Divide( nx,ny);
     }
     for (int i=0; i<n; i++) {
        if (n > 1) c2->cd(i+1);
        SamplingDistPlot * pl = plot->MakeTestStatPlot(i);
        pl->SetLogYaxis(true);
        pl->Draw();
     }
  }
*/

}

Thanks !

Best,
Junhui

Hi, Lorenzo,

The problem has been solved ! :smiley: 8) .

Let me explain a little bit detailed how caught the bug.
As mentioned in my previous reply, the problem looks like come from the line of

HypoTestInverterResult * r = calc.GetInterval();

And apparently, the this line should have a very close connection to this line

calc.SetFixedScan(npoints,poimin,poimax)

After taking a close look on this website,
root.cern.ch/root/html/RooStats … tFixedScan, I realized I have made a mistake on the value of “points”.
In the “SetFixedScan”, the first variable, “nBins”, turns out to be the number of bins, not the number of entries !
That’s where I misunderstood.
For the imported tree file in my code, it has 100bins, 170 entries.
What I should do is solely to change 170 to 100 both in “CountingModel.C” and "SimpleHypoTestInv.C ".

The correct codes are [7],[8].

Anyway, thanks for your comments !

Best,
Junhui

[7]

#include "TTree.h"

using namespace RooFit;
using namespace RooStats;

void CountingModel(  int nsig = 0,         // number of signal events .
                     int nbkg = 100)       // number of background events.
{
   RooWorkspace w("w");

   ///~~~~~~~~~~~~~~~~~~  Begining of creating a model p.d.f. ~~~~~~~~~~~~~~///

  RooClassFactory::makePdf("BkgLinearPdf","x,A,B") ;
#ifdef __CINT__
  gROOT->ProcessLineSync(".x BkgLinearPdf.cxx+") ;
#endif


  RooClassFactory::makePdf("OOnePdf","x,cof") ;
#ifdef __CINT__
  gROOT->ProcessLineSync(".x OOnePdf.cxx+") ;
#endif

   w.factory("BkgLinearPdf:bkg_pdf(x[0,10], a[2.5,1.0,30.0], b[-0.0125])");/// a,interceipt; b, slope.

   w.factory("OOnePdf:sig_pdf(x[0,10],cof[1.0,0.01,100.0])");

   w.factory("SUM:model(nsig[0,100]*sig_pdf, nbkg[0,100]*bkg_pdf)");  // for extended model

   ///~~~~~~~~~~~~~~~~~~  End of creating a model p.d.f. ~~~~~~~~~~~~~~///

   
   ///~~~~~~~  Begining of creating the ModelConfig object and set "b0" as a global observable ~~~///

   w.var("nsig")->setVal(nsig); 
   w.var("nbkg")->setVal(nbkg); 
   w.var("a")->setConstant(true); // needed for being treated as global observables
   
   ModelConfig mc("ModelConfig",&w);/// declare the name and reference of ModelConfig as "mc" and "ModelConfig".
   mc.SetPdf(*w.pdf("model")); 
   mc.SetParametersOfInterest(*w.var("nsig")); 
   mc.SetObservables(*w.var("x"));  

   w.defineSet("nuisParams","a,cof,nbkg");
   mc.SetNuisanceParameters(*w.set("nuisParams") );

   // these are needed for the hypothesis tests
   mc.SetSnapshot(*w.var("nsig"));
   // need now to set the global observable
   mc.SetGlobalObservables(*w.var("a"));

   mc.Print();
   // import model in the workspace 
   w.import(mc);

   
   ///~~~~~~~  End of creating the ModelConfig object and set "b0" as a global observable ~~~///

  // import a tree
  TFile *treefile = TFile::Open("measuredEnergyTree.root");
  TTree* tree = (TTree*) treefile->Get("ntuple");
   
  // Define observables x
  RooRealVar x("x","x",1,0,10) ;

  // Import only observables (x)
  RooDataSet ds("ds","ds",*w.var("x"),Import(*tree)) ; ///refered from rf102_dataimport.C
  ///RooDataSet ds("ds","ds",*w.var("x"),Import(*tree)) ;
  ds.Print() ;
  ds.add( *w.var("x"));
  ///ds.add( RooArgSet(x));


  RooPlot* frame3 = x.frame(Title("import data from a tree")) ;
  ds.plotOn(frame3) ;

  TCanvas* c = new TCanvas("import data from a tree","import data from a tree ",800,800) ;
   gPad->SetLeftMargin(0.15) ; frame3->GetYaxis()->SetTitleOffset(1.4) ; frame3->Draw() ;


   w.import(ds);

   w.Print();

   TString fileName = "CountingModel.root"; 

   // write workspace in the file (recreate file if already existing)
   w.writeToFile(fileName, true);

}

[8]

using namespace RooStats;
using namespace RooFit;
#include "TMath.h"

void SimpleHypoTestInv( const char* infile =  "CountingModel.root", 
                        const char* workspaceName = "w",
                        const char* modelConfigName = "ModelConfig",
                        const char* dataName = "ds" )
{

   ROOT::Math::MinimizerOptions::SetDefaultPrintLevel(1);
   RooStats::AsymptoticCalculator::SetPrintLevel(1);  

  /////////////////////////////////////////////////////////////
  // First part is just to access the workspace file 
  ////////////////////////////////////////////////////////////

  // open input file 
  TFile *file = TFile::Open(infile);
  if (!file) return;

  // get the workspace out of the file
  RooWorkspace* w = (RooWorkspace*) file->Get(workspaceName);


  // get the modelConfig out of the file
  RooStats::ModelConfig* mc = (RooStats::ModelConfig*) w->obj(modelConfigName);

  // get the data out of the file
  RooAbsData* data = w->data(dataName);

  ModelConfig*  sbModel = (RooStats::ModelConfig*) w->obj(modelConfigName);
  RooRealVar* poi = (RooRealVar*) sbModel->GetParametersOfInterest()->first();
  /// background model created from s+b model by setting nsig = 0.
  ModelConfig * bModel = (ModelConfig*) sbModel->Clone();
  bModel->SetName(TString(sbModel->GetName())+TString("_with_poi_0"));      
  poi->setVal(0);
  bModel->SetSnapshot( *poi  );

  FrequentistCalculator  fc(*data, *bModel, *sbModel);
  ///fc.SetToys(3000,1500);  
  fc.SetToys(1000,500);  

  // asymptotic calculator
  /// AsymptoticCalculator  ac(*data, *bModel, *sbModel);
  ///  ac.SetOneSided(true);  // for one-side tests (limits)
  //  ac->SetQTilde(true);
  /// AsymptoticCalculator::SetPrintLevel(-1);


  // create hypotest inverter 
  // passing the desired calculator 
  ///HypoTestInverter calc(ac);    // for asymptotic 
  HypoTestInverter calc(fc);  // for frequentist

  // set confidence level (e.g. 95% upper limits)
  calc.SetConfidenceLevel(0.95);
  
  // for CLS
  bool useCLs = true;
  calc.UseCLs(useCLs);
  calc.SetVerbose(false);

  // configure ToyMC Samler (needed only for frequentit calculator)
  ToyMCSampler *toymcs = (ToyMCSampler*)calc.GetHypoTestCalculator()->GetTestStatSampler();
  
  // profile likelihood test statistics 
  ProfileLikelihoodTestStat profll(*sbModel->GetPdf());
  // for CLs (bounded intervals) use one-sided profile likelihood
  if (useCLs) profll.SetOneSided(true);

  // ratio of profile likelihood - need to pass snapshot for the alt 
  // RatioOfProfiledLikelihoodsTestStat ropl(*sbModel->GetPdf(), *bModel->GetPdf(), bModel->GetSnapshot());
   
  // set the test statistic to use 
  toymcs->SetTestStatistic(&profll);

  // if the pdf is not extended (e.g. in the Poisson model) 
  // we need to set the number of events
  if (!sbModel->GetPdf()->canBeExtended())
     toymcs->SetNEventsPerToy(1);


  int npoints = 100;  // number of points(bins ?) to scan
  // min and max (better to choose smaller intervals)
  ///double poimin = poi->getMin();
  ///double poimax = poi->getMax();
  double poimin = 0.; poimax = 10.;
  std::cout << endl << "poimin = " << poimin << ", poimax = " << poimax <<  endl;

  std::cout << "Doing a fixed scan  in interval : " << poimin << " , " << poimax << std::endl;
  calc.SetFixedScan(npoints,poimin,poimax);
  
  std::cout << endl << "test flag 6" << endl;
  // run inberter, only one line, :-).
  HypoTestInverterResult * r = calc.GetInterval();

  std::cout << endl << "test flag 7" << endl;
  double upperLimit = r->UpperLimit();

  std::cout << endl << "test flag 8" << endl;
  std::cout << "The computed upper limit is: " << upperLimit << std::endl;
  
  // compute expected limit
  std::cout << "Expected upper limits, using the B (alternate) model : " << std::endl;
  std::cout << " expected limit (median) " << r->GetExpectedUpperLimit(0) << std::endl;
  std::cout << " expected limit (-1 sig) " << r->GetExpectedUpperLimit(-1) << std::endl;
  std::cout << " expected limit (+1 sig) " << r->GetExpectedUpperLimit(1) << std::endl;
  

  // plot now the result of the scan 

  HypoTestInverterPlot *plot = new HypoTestInverterPlot("HTI_Result_Plot","HypoTest Scan Result",r);

  // plot in a new canvas with style
  TCanvas * c1 = new TCanvas("HypoTestInverter Scan"); 
  c1->SetLogy(false);

  plot->Draw("CLb 2CL");  // plot also CLb and CLs+b 
  //plot->Draw("OBS");  *// plot only observed p-value
}

Thanks for letting us known you solved the problem

Cheers

Lorenzo

Hi, Lorenzo,

Thanks for your comments anyway !
I will tag it as “solved” afterwards.
However, there’re a small piece of question I’d like to ask.

Taking the exercise 7 on the page as an example, twiki.cern.ch/twiki/bin/view/Ro … del_and_fi .
On the produced plot of p-value vs s .
The notation of “Observed CLs, CLs+b, CLb” and Expected "CLs - Median, CLs ± 1 sigma, CLs ± 2 sigma " confused me a little bit.
Because in this example, the null hypothesis is “b only” while the alternative one is “s + b”.
So, I was wondering, the " Observed CLs, CLs+b, CLb " are the tested results of “s + b” model or of the "b only " one ? If it’s from “s + b” model, then it probably should change to “EXPECTED CLs, CLs+b, CLb” because this it’s an alt model, not a null one; if it’s from “b only” model, then there should exist only “observed CLb” because under this case, only background(and no signal) in this null hypothesis !

Thanks !

Best,
Junhui

Hi,

In the case of a significance test, i.e to test how much the background hypothesis is not compatible with the observed data, it is correct that the null hypothesis consists of the background only model.
In that case the returned CLb value from RooStats is the null-p values, while CLs+b is the alt p-value.
The returned CLs, computed as CL(s+b)/CL(b) does not make sense, so please ignore.
Eventuall, in case of very low null and alt p-value, one could compute (as it is done for th limits) a corrected null p-value as CL(b)/CL(s+b)., but it is not done in this case.

The expected p-values are the null-p values (expected CLb), obtained when instead of the observed value of the
data test statistic, the median (or the +/ n-sigma values) of the alternative (S+B model) test statistic distribution is used.

Best Regards

Lorenzo

Hi, Lorenzo,

Thanks for your detailed explanation !

I have two more questions on the interpretation of the results.
Taking the exercise 7 as an example on this page,
twiki.cern.ch/twiki/bin/view/Ro … del_and_fi .

Q1 :
From the plot of “HypoTest Scan Result”, the p-values : “observed CL(b)” < “observed CL(s+b)” < “expected CL(s)- Median”, which means the data are more likely to follow the model of “s+b”, not the model of “b only”.
Can I interpret like this ?

Q2 :
The p-values of models turn out don’t have scientific sense ? Which sounds a little bit weird.
The confidence level has been set to 0.95, which means the hypothesis should be rejected at 95% CL if p < 0.05.
While, from this plot, the p-values of null hypothesis(“Observed CL(b)”, “b only” model) always >= 0.05. That’s to say, the null hypothesis shouldn’t be rejected, the data is more background-like model ? This is conflict with Q1.
Apparently, this isn’t the case since the data has generated by the same model in exercise 4 on the page mentioned above. Which means the data should be a more signal-like one.

Could you please comment on these ?

Thanks !

Best,
Junhui

Hi,

Let me understand this correctly. You are referring to Exercise 7, which is scan of hypothesis tests to invert it, i.e. to find the parameter value corresponding at a given p-value (e.g. 0.05).
This is done for computing a list and not a significance. In this case the null hypothesis is the S+B model, because we want to compute a limit, i.e. reject the hypothesis that we have signal in our data.
What I was saying in the previous posts refers to a significance test (e.g. Exercise 6).

  1. Now in a Hypothesis inversion scan (for computing a limit/interval), if we have CLb < CLs+b, it means that the data prefer S+B model. So in this case you cannot really set a limit, but compute an interval. The correct way to do in that case would be to use also a two sided test-statistics and compute a Feldman-Cousins interval.
    There is no need to use CLs, but to use directly CLs+b
    More information is from slides #76 of
    indico.desy.de/getFile.py/acces … nfId=11244

  2. The data set used in Exercise 7 is more background-like. You see that CLs+b is always smaller than CLb, apart for very small values of the parameter of interest (signal rate). We see that for s > 6 the CL(s+b) is smaller than 0.05, so we can reject the S+B null hypothesis and set a 95% limit for s at around 6.

Lorenzo

Hi, Lorenzo,

Thanks for your latest comments ! Which is very useful .

There actually exist another confusion on the output of exercise 7.

To be honest, event after reading this paper, arxiv.org/abs/1007.1727v3 , I still don’t think I understand the plot(see the attachment please) very well : does the curve of “ModelConfig_with_poi_0” and “ModelConfig” represent equation[7] of that paper for the models of “b only” and “s+b” separately, and the black curve of “test statistic data” is equivalent to equation[14] ?

Could you please comment a little bit on it ?

Thanks !

Best,
Junhui
SimpleHypoTestInvProfLLRatio.pdf (36.3 KB)

Hi

In the attached plot what is shown is the test statistics as defined in eq. (14). It is the same definition of the test statistic, but it is evaluated on :

  • pseudo-experiments generated with signal at a certain mu value + background (red-curve)
  • pseudo-experiment generated with background only (mu=0) (blue-curve)
  • value on the observed data (black line)

The plots are done for different mu-value. When mu-changes the test statistic definitions also changes (it -depends on mu), for this you see a difference in the test statistic distribution of the blue histogram, although is obtained with pseudo-experiments generated always with mu=0

Lorenzo

Hi, Lorenzo,

Thanks again !

Everything is almost clear now, except one, :slight_smile:.
Which is “value on the observed data (black line)”.
Do you mean this is “a variable x” ? as defined two lines above of the eq.(2) of same paper ?

PS. As mentioned on the page 7 of the paper, eq.(14)beneath :
"… higher values of $q_\mu$ represent greater incompatibility between the data and the hypothesized value of $\mu$ " .
From the plot I’ve attached in my previous post, one actually can interpret that the hypothesis of “background only”(blue) is disfavored because its $q_\mu$ is almost always bigger than “signal + background” hypothesis(red).
Am I right ?

Thanks !

Best,
Junhui

Hi,

The black line is the test statistic value (defined as in eq. 14) evaluated on the observed data set. It is q_mu evaluated at the test value of mu and on the data set values x.

It is correct, the sentence
"… higher values of $q_\mu$ represent greater incompatibility between the data and the hypothesized value of $\mu$ " .

In this case you are comparing test statistic values obtained from data set generated with mu=0 (blue curve) and test statistics value obtained from data set generated with mu=mu_test (red-curve)
It is then obvious that the blue-curve is not compatible with the hypothesized value ofmu, which is mu_test.

Lorenzo

Hi, Lorenzo,

Thanks for all of your replies !

Let me summarize our discussion a little bit here and hopefully it’s useful for later visitors.
I’m trying to link the paper(twiki.cern.ch/twiki/bin/view/Ro … del_and_fi).
Please feel free to comment and, correct me if I’m wrong.

Thanks !

Best,
Junhui


0). The idea of this test statistic is, based on experimental data, one compares which model is more applaudable: "s + b" or "b only". The way of comparison is to calculate eq.(14). Bigger the value it is, greater the incompatibility it is.

1). To calculate eq.(14), one essentially calculates eq.(7), in a way.
The "\mu" in the numerator is the variable decided by the range of "s" of exercise 4 and the "npoints" of the exercise 7 on the tutorial page, which are [0,15]  and 10 separately. Therefore, the interval of "s" is 15/(10 - 0 +1) = 15 / 9 = 5/3; the values of "s" are 5/3, 10/3, 15/3 =3, 20/5 ... 45/3 = 15. These values of "s" are called "\mu" in the notation of the paper.

2). The whole procedure of calculating eq.(14) is following.
For a given "s" or "\mu"(the values of 5/3, 10/3 etc in section 1).), 
2.1). RootFit calculates all of the pieces of eq.(7) : "\mu hat", conditional maximum-likelihood(ML), unconditional ML, and profile likelihood ratio; then calculates eq.(14) and p values. The number of events generated for a given "\mu" depends on the toy numbers which have been assigned by the line of "fc.SetToys(1000,500);" in exercise 7.

2.2). After getting eq.(7), eq.(14) is easy to reach.

PS 1. The "p value" not only depends on eq.(14), but " q \mu, obs ". This is shown in eq.(15).

PS 2 . The test statistic data or black line can be obtained by substituting "\mu hat" with experimental data(the "\mu" value is known at each scanning point) to calculate eq.(7) and then eq.(14).