Calling macro several times - Unload error

maikenp · August 2, 2010, 5:55pm

Hi,

I have a macro that calls a set of classes, and I want to do this several times, each time with a different inputfile, and each time storing the result in a dedicated file for this inputfile. A simplified version of the macro is quoted below:

void runSelector(TString inputfileList) {

  TChain *c = new TChain("susy");
  
  ifstream in(inputfileList);
  std::string temp;

  TString file;
  /*LOOPING OVER THE LINES IN THE FILELIST - EACH LINE CONTAINS ONE DATASET*/
  while( getline(in,temp)){
    file = temp;
     
    c->Add(file);
    c->Process("D3PDSelector.C++g");

   //c->Unload("D3PDSelector.C");
  }  
}

Doing this I get the following message before my code crashes:

I would like to be able to call ROOT only once, loop over the datasets in the inputfilelist, and store the output for each line in the inputfilelist separately. Is there a way to do this? I tried to unload D3PDSelector as you can see from the code above, but unfortunately this did not work (I might have misunderstood the procedure here).

Hope there is a simple solution to this, thank you very much
Maiken

Axel · August 3, 2010, 12:22pm

Hi,

you could separate the compilation of the selector and the processing. I.e.

void runSelector(TString inputfileList) {
  if (!TClass::GetClass("D3PDSelector")) {
     gROOT->ProcessLine(".L D3PDSelector.C++g");
  }
  TChain *c = new TChain("susy");
 
  ifstream in(inputfileList);
  TString file;
  while(file.ReadLine(in)){
    c->Add(file);
    D3PDSelector *sel = new D3PDSelector();
    c->Process(sel);
    delete sel;
  } 
}

Though I fail to understand why you add file 1 to the chain, process the chain containing file 1, then add file 2 and process the chain containing file 1 and 2, then add file 3 and process the chain containing file 1, 2 and 3 etc. You probably wanted to write simply:

void runSelector(TString inputfileList) {
  TChain *c = new TChain("susy");
  ifstream in(inputfileList);
  TString file;
  while(file.ReadLine(in)){
    c->Add(file);
  } 
  c->Process("D3PDSelector.C+g");
  delete c;
}

Cheers, Axel.

maikenp · August 3, 2010, 1:03pm

Hi,

thank you very much for your reply.

Well, your last suggestion will add all the datasets in one chain, and then loop over all these datasets and finally output a result. I want to loop over each dataset individually and output a result per dataset. As each dataset can contain several files I use a chain, but do not want to add all the datasets together in this chain, only the files belonging to the current dataset I am looping over.

With the first suggestion, I get a segmentation violation due to the
delete sel;

without this, though, I end up summing up all the chains, so that each result written out is the cumulative sum of the previous results, which I do not want. I would just like to open ROOT once, read in one dataset at a time, write out the result, and then treat the next dataset, without having to start and stop ROOT in between. But I always seem to get problems with loading or unloading, or messages like: "Error in TSelectorList::CheckDuplicateName: an object with the same name: data is already in the list"

thanks
Maiken

Axel · August 3, 2010, 2:00pm

Hi,

I still don’t understand your while loop. I assume one data set corresponds to one call of runSelector()? Then your while loop is wrong, as I said. Mine should be okay, because it adds files (of the same data set) to the chain, and then processes all files of that dataset. If this doesn’t work then please describe the relation data set / runSelector() / inputfileList / file more precisely.

Cheers, Axel.

maikenp · August 3, 2010, 3:06pm

Hi again,

sorry for not explaining properly.

What I initially did was to have a Python script iterate over the datasets, starting ROOT with runSelector.C for each dataset and quitting ROOT again afterwards. Since it takes a fairly long time to open and close ROOT for each dataset, I wanted to loop over the call to my analysis (D3PDSelector.C) inside runSelector.C without opening and closing ROOT each time. This is why I added the loop over the datasets to the runSelector.C that I posted earlier.

I attach a typical filelist with the datasets included (here you can see that each dataset has a wildcard so that all files in the dataset are included). I also attach the Python script dssetlooper.py that initially called runSelector.C, and finally my original runSelector.C. This might help you better understand what I tried to do.

what I do then is
python dssetlooper.py "filelist_test.txt"

What I want now is the most time-efficient way of iterating over all the datasets, making sure that the “environment” is cleaned each time so that a new file is stored per dataset, and that the variable e.g. TParameter data, is read in again and so on.

Hope this is more clear.
Thanks
Maiken

filelist_test.txt (876 Bytes)
runSelector.C (440 Bytes)
dssetlooper.py (404 Bytes)

Axel · August 5, 2010, 1:40pm

Hi,

okay, now I understand. This should work:

#include "D3PDSelector.C+g"

void runSelector(TString inputfileList) {
  ifstream in(inputfileList);
  TString file;
  while(file.ReadLine(in)){
    TChain *c = new TChain("susy");
    c->Add(file);
    D3PDSelector *sel = new D3PDSelector();
    c->Process(sel);
    delete sel;
    delete c;
  }
}

If it still crashes in delete sel (which I would find weird) then just remove it - the memory leak should be small.
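
For the "one output file per dataset" part, which the loop above does not show, a rough sketch in plain standard C++ could derive an output name from each line of the file list. The helper name outputNameFor, the "_result.root" suffix, and the assumption that the dataset is identified by the directory holding its files are all hypothetical; none of this is ROOT API.

```cpp
#include <string>

// Hypothetical helper (not part of ROOT): turn one line of the file list,
// e.g. "/data/dsA/*.root", into an output name such as "dsA_result.root".
// Assumption: the dataset is identified by the directory holding its files.
std::string outputNameFor(const std::string& datasetLine) {
  // Drop the file pattern after the last '/'.
  const std::string::size_type lastSlash = datasetLine.find_last_of('/');
  if (lastSlash == std::string::npos) {
    // No directory part: fall back to the whole line as the dataset name.
    return datasetLine + "_result.root";
  }
  const std::string dir = datasetLine.substr(0, lastSlash);
  // Keep only the last directory component as the dataset name.
  const std::string::size_type prevSlash = dir.find_last_of('/');
  const std::string name =
      (prevSlash == std::string::npos) ? dir : dir.substr(prevSlash + 1);
  return name + "_result.root";
}
```

Inside the while loop, the resulting name could then be handed to the selector, for example through the option argument of Process, provided D3PDSelector is written to read it and open its output file accordingly.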

Cheers, Axel.

pcanal · August 5, 2010, 3:26pm

Hi,

If the crash is actually in the ‘delete c;’ then the problem has been fixed by revision 33354 of the trunk (i.e. v5.27/04 and up). See savannah.cern.ch/bugs/?25104

Cheers,
Philippe.

maikenp · August 6, 2010, 10:01am

Hi,

thanks a lot for your suggestion. As you indicate, I actually do get a crash when I do delete sel; (and I do not get a crash for delete c; but get the problem below:)

However, if I do not do delete sel;, I end up where I was earlier, that the variables I use etc are not reset, for instance: all the histograms are just accumulating events as I run over the different datasets. How should I go about cleaning the environment properly before starting a new iteration in the file-loop in runSelector.C?

I attach the stack trace here so that you can spot the problem more easily. I am using ROOT version 5.27/04, which, as I read from the documentation, should include the bug fix mentioned in the last post in this thread (the delete c; issue).
stackTrace.txt (5.07 KB)

pcanal · August 6, 2010, 12:53pm

Hi Maiken,

In the example given by Axel, a new selector object is created each time, and presumably this selector properly initializes its internal data members, so at worst you should see a memory leak. So I am guessing that either I completely misunderstand how you set up your loop, or there is a problem in the initialization of the values. To investigate further we would need more details about the code (ideally a complete running example).
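
To illustrate the kind of initialization problem that could cause this, here is a plain C++ sketch (no ROOT involved; LeakySelector, FreshSelector and entriesInLastPass are made-up names, not the real D3PDSelector). State kept in a static or global survives the creation of a new object, while an ordinary data member starts out empty each time. Whether this is what happens in your selector is only a guess.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a selector whose state lives in a static:
// the container is shared by all instances and is never reset.
struct LeakySelector {
  static std::vector<int>& counts() {
    static std::vector<int> shared;  // created once for the whole session
    return shared;
  }
  void process(int value) { counts().push_back(value); }
  std::size_t entries() const { return counts().size(); }
};

// Same idea, but the state is an ordinary data member:
// every new object starts with an empty container.
struct FreshSelector {
  std::vector<int> counts;
  void process(int value) { counts.push_back(value); }
  std::size_t entries() const { return counts.size(); }
};

// Runs nDatasets passes of perDataset values each, creating a new selector
// object per pass, and returns the entry count seen at the end of the last pass.
template <typename Selector>
std::size_t entriesInLastPass(int nDatasets, int perDataset) {
  std::size_t last = 0;
  for (int d = 0; d < nDatasets; ++d) {
    Selector sel;  // new object per dataset, as in Axel's loop
    for (int i = 0; i < perDataset; ++i) sel.process(i);
    last = sel.entries();
  }
  return last;
}
```

With three datasets of ten values each, the FreshSelector variant ends with ten entries in the last pass, whereas the LeakySelector variant ends with thirty, which is the same accumulating pattern described for the histograms.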

Cheers,
Philippe.