Subtracting histograms from different files

Dear Experts,

I have, let's say, six files f0.root, f1.root, f2.root, f3.root, f4.root, f5.root, and they of course contain the same histograms. I want to take each key from files f1.root to f5.root and subtract it from the corresponding key in f0.root. I am copying the skeleton below. It gives the intended result, except that at the end of the loop it uses the entries of only the last opened file. Could you please point out where I am going wrong?

void list_files(
    const char *idirname="/dir/",
    const char *type=".root"){

    TSystemDirectory idir(idirname, idirname);
    TList *files = idir.GetListOfFiles();
    cout <<" In Path : "<<idirname<<endl;
    std::vector<std::string> samples;
    std::vector<std::string> variables;

    TH1F *hsig;
    TH1F *hbkg;
    TH1F *htmp;

    if(files){
        TSystemFile *file;
        TString fname;
        TIter next(files);
    while ((file=(TSystemFile*)next())){
        fname = file->GetName();
        if (!file->IsDirectory() && fname.EndsWith(type)){
            cout <<" File names : " <<fname.Data() << endl;
            samples.push_back(fname.Data());
            if (fname.Contains("f0")){
                TFile f(idirname+fname);
                for(auto && keyAsObj : *f.GetListOfKeys()){
                    auto key = (TKey*) keyAsObj;
                    //cout << key->GetName() << " " << key->GetClassName() << endl;
                    variables.push_back(key->GetName());
                    //hsig = (TH1F*)f.FindObjectAny("mass");
                    hsig = (TH1F*)f.Get("mass")->Clone();
                    hsig->SetDirectory(0);
                    }
                }
            else{
                // Build the path the same way as for f0; a zero-length char buffer filled with sprintf would overflow.
                TFile g(idirname+fname);
                htmp = (TH1F*)g.Get("mass")->Clone();
                htmp->SetDirectory(0);
                cout <<" htmp : "<<htmp->Integral()<<endl;
                }
            }
        }
    }
    hsig->Add(htmp,-1);
    cout << "Entries in sig: "<<hsig->Integral()<<" Entries in bkg : "<<htmp->Integral()<<endl;
    TFile outfile("output.root","RECREATE");
    hsig->Write();
    outfile.Close();

delete files;
}

Hi,

hsig->Add(htmp,-1); should probably be inside the loop rather than outside.

Cheers,
Philippe.
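
To illustrate the point about where the Add call sits, here is a minimal ROOT-free sketch. It assumes a histogram can be modelled as a plain vector of bin contents; the names Bins, subtractInPlace, subtractAll and subtractLastOnly are hypothetical and are not part of ROOT. subtractInPlace plays the role of hsig->Add(htmp,-1).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a histogram: one entry per bin (not a ROOT type).
using Bins = std::vector<double>;

// Plays the role of hsig->Add(htmp, -1): bin-by-bin subtraction.
void subtractInPlace(Bins &sig, const Bins &tmp) {
    assert(sig.size() == tmp.size());
    for (std::size_t i = 0; i < sig.size(); ++i) sig[i] -= tmp[i];
}

// Subtraction inside the loop: every background contributes.
Bins subtractAll(Bins sig, const std::vector<Bins> &backgrounds) {
    for (const Bins &tmp : backgrounds) {
        subtractInPlace(sig, tmp);
    }
    return sig;
}

// Subtraction after the loop, as in the posted skeleton:
// the pointer is overwritten on every iteration, so only the last one is used.
Bins subtractLastOnly(Bins sig, const std::vector<Bins> &backgrounds) {
    const Bins *last = nullptr;
    for (const Bins &tmp : backgrounds) last = &tmp;
    if (last) subtractInPlace(sig, *last);
    return sig;
}
```

With a signal of {10, 20} and two backgrounds {1, 2} and {3, 4}, subtractAll removes both backgrounds, while subtractLastOnly removes only the second one, which mirrors the symptom described in the question.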

Well, the proposed solution doesn’t seem to work :)

Are the histograms by any chance profile histograms? Because the Add function doesn’t work properly for profiles.

In that case you can make a projection of the profile and use the projections for the subtraction.

Dear NikkieD,

Thank you for your reply. The ROOT files contain simple histograms. I am not sure whether there is something like “hsubtract”, similar to hadd. Life would be really nice if we had such an executable, and I don’t know the reason for not having something like hsubtract.

Regards

[quote]Well, the proposed solution doesn’t seem to work :)[/quote]Can you be more specific about what you tried and how it still fails?

Thanks,
Philippe.

Hi Philippe,

I mean it doesn’t give the desired results. I am attaching the tar in case you would like to have a look.
files.tar (750 KB)

[quote]I mean It doesn’t give the desired results[/quote]How does it differ?

You have:

    while ((file=(TSystemFile*)next())){
        ...
        hsig = (TH1F*)f.Get("h_2j1t_mtw")->Clone();
        hbkg = (TH1F*)f.Get("h_2j1t_mtw")->Clone();
        hsig->SetDirectory(0);
        hbkg->Reset("ICES");
        hbkg->SetDirectory(0);
    }

and hsig and hbkg are outer variables. They will be assigned the ‘content’ of the last file … is that the intent?

You have:

    htmp = (TH1F*)g.Get("h_2j1t_mtw")->Clone();
    //cout <<" htmp : "<<htmp->Integral()<<endl;
    htmp->SetDirectory(0);
    hsig->Add(htmp,-1);
    hbkg->Add(htmp);
    g.Close();

Here h_2j1t_mtw is unnecessarily duplicated, and the duplicate is leaked (never deleted).

You have the loop:

    while ((file=(TSystemFile*)next())){
        fname = file->GetName();
        if (!file->IsDirectory() && fname.EndsWith(type)){
            if (fname.Contains("DDQCD_muonantiiso")){

seemingly twice … Besides, as far as I can tell, the list (like listSig) contains n_histo * n_files items rather than n_items … it looks like the first file is ‘added/subtracted’ twice …

Cheers,
Philippe.

Hi Philippe,

Thank you for your reply. Actually, they are declared outside intentionally, just to make sure that they exist, and then they are reset to zero; this is only to be safe and to avoid other hidden problems.

Actually, it works as intended only when we give the name of the histogram by hand, say “h_2j1t_mtw”, and it fails to reproduce the same result when done automatically.

You are right that “h_2j1t_mtw” is duplicated, but that is done to cross-check whether we get the same result. If you look at the end of the output, there are three lines printed:

key Name : h_2j1t_mtw | Entries in subtracted 5541.54 | Entries in bkg 2026.46
key Name : h_2j1t_mtw | Entries in subtracted 5541.54 | Entries in bkg 2026.46 
Entries in subtracted : 4668.58 Entries in bkg : 2899.42

Do you see the difference between “Entries in bkg : 2899.42” and “Entries in bkg 2026.46”?
The last line is what we get from “h_2j1t_mtw”, i.e. when only one histogram is taken by hand; the other two lines are what I get when I iterate over the list of keys and try to do it for all the keys present in the ROOT file. In principle they should all give the same result.

Hi,
I forgot to mention here that the numbers:

were checked by hand, with each ROOT file added/subtracted together, and they are correct, whereas the number “Entries in bkg 2026.46” is the entries in just one file, and it is taken to be the same for
the rest of the files, which makes it a total of “Entries in bkg 2026.46”.

Hi,

You may have missed the explanation in the User’s Guide on ‘cycles’ and keys. Your input files contain:

    root [1] file->ls()
    TFile** DDQCD_muonantiiso.root
     TFile* DDQCD_muonantiiso.root
      KEY: TH1F h_3j1t_MuPhi;2
      KEY: TH1F h_3j1t_MuPhi;1
      .....
      KEY: TH1F h_2j1t_mtw;2
      KEY: TH1F h_2j1t_mtw;1
      ....

where the key whose ;X suffix has the highest number contains the last saved state of the histogram, while the lower-numbered versions correspond to backup copies from previous Write-ing of the histogram.

In your case you are only interested in the highest-numbered keys. By calling TFile::Get (rather than asking the TKey directly), you do get the last copy, but the code is missing a test to ‘skip’ the keys with the same name in the same file. One way to accomplish what you need is to replace your

std::vector<string> keys;

by

std::set<string> keys;

Cheers,
Philippe.
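
A minimal sketch of the std::set suggestion, in plain C++ without ROOT. It assumes the key names arrive as plain strings without the ;cycle suffix (which is how key->GetName() is used in the posted code), so a file holding two cycles of a histogram lists its name twice. The function name uniqueNames is hypothetical.

```cpp
#include <set>
#include <string>
#include <vector>

// Collect each histogram name once, even when the file lists
// several cycles of the same histogram under the same name.
std::set<std::string> uniqueNames(const std::vector<std::string> &keyNames) {
    std::set<std::string> names;
    for (const std::string &n : keyNames) {
        names.insert(n);  // inserting an existing name is a no-op
    }
    return names;
}
```

Looping over the returned set and calling TFile::Get with each name would then process every histogram exactly once per file. Note that std::set iterates in sorted order, not in the order the keys appear in the file.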