Memory issues with reading and writing histograms to file

Dear ROOTers,
For my analysis, I have a piece of code that works fine using nominal systematics. The structure of the code looks something like this:

std::map<std::string, TH1*> histogram_map{};
// declare and fill histogram map with some histograms
// open output file
// (loop over events in file) {
//     calculate weight of event
//     calculate some reconstructed objects
//     fill histograms with results
// }
// save histograms
// close file

Now I am trying to include different systematics in my analysis:

std::map<std::string, TH1*> histogram_map{};
std::map<std::string, Double_t> syst_map{};
// declare and fill histogram map with some histograms
// open output file
// create folders for the different systematics
// save empty histograms inside the folders (same histograms as before, just for each systematic now)
// (loop over events in file) {
    // calculate some reconstructed objects
    for (auto iter : syst_map) {
        // calculate event weight for the systematic
        OpenFile->cd(iter.first.c_str());
        TDirectory* dir = gDirectory;
        // loop over the keys and replace the histogram_map histograms with the
        // histograms (with n events) saved in the folder by doing
        TIter next(dir->GetListOfKeys());
        TKey* key;
        while ((key = (TKey*)next())) {
            std::string get_name(key->GetName());
            if (std::strncmp(key->GetClassName(), "TH1F", 4) == 0)
                histogram_map[get_name] = (TH1F*)dir->Get(key->GetName());
            // same for 2D, but with TH2F
        }
        // fill histogram_map histograms with the new event
        // loop over the histogram map and save the histograms again (n + 1 entries):
        for (auto& hist_iter : histogram_map)
            hist_iter.second->Write(hist_iter.first.c_str(), TObject::kOverwrite);
    }
// }
// close file

Unfortunately, jobs submitted using this method are killed quickly because they reach their memory limit. I have found a workaround: closing the file after saving the empty histograms, then opening and closing it inside the systematics loop every time. However, this makes the script painfully slow. I could submit separate jobs for separate systematics and hadd the files afterwards, but I would like to do all of the systematics at once, since that way the “calculate some reconstructed objects” step runs only once per event and far fewer jobs are submitted (although they will be a lot slower). Any suggestions are welcome.
Thank you,
Vangi

Well, if you keep all histograms in memory, you might well hit the limit. Not sure what would be the best solution here… Maybe @eguiraud has an idea?

Hello @bellenot, thank you for your reply. I am not a ROOT/C++ expert, but I was hoping that the only histograms staying in memory are the ones in histogram_map, for example “electron_pt”, and that each entry gets replaced by the “electron_pt” of the appropriate systematic. The number of events for each systematic should be the same as without systematics (first version of the script), so I was hoping that nothing additional was being kept in memory, just replaced by the appropriate histogram read from the file. Please correct me if I am wrong.

Looking at the code, it looks to me like all histograms (here TH1F) are being read and kept in the std::map. Or did I miss something?

They are, but they were being stored in the first version as well. After that, aren’t they replaced by another histogram? E.g. if I have histogram_map[“electron_pt”] for the systematic “weight_pileup_UP”, I save that in the “weight_pileup_UP” folder. Then, on the next systematic, “weight_pileup_DOWN”, isn’t the map entry for histogram_map[“electron_pt”] replaced by that folder’s “electron_pt”, filled with the event entry, and stored again inside the “weight_pileup_DOWN” folder? So the histogram paired with the “electron_pt” string is different, but still the same size as before, and the corresponding histogram associated with “weight_pileup_UP” should hopefully no longer be stored in memory. At least, this is what I am trying to do. In the initial script, I wasn’t saving the histograms inside a loop; I was storing them in memory (same size histogram_map, again hopefully) and saving them at the end of the script without any memory issues.

I might be wrong (I am not sure how std::map handles this), but replacing a pointer value with another one does not mean the object the first one pointed to gets deleted. You should call delete on the old pointer value before replacing it, so that you first delete the old object and then replace it with the new one…
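
To illustrate the point about replacing pointers in a std::map, here is a minimal plain C++ sketch with no ROOT dependency. FakeHist is a hypothetical stand-in for a histogram that simply counts how many instances are alive, and the three function names are made up for this illustration. The sketch does not model ROOT's own ownership of histograms attached to a directory, so treat it as a picture of the C++ behaviour only, not as a drop-in fix for the analysis code.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a histogram: counts how many instances are alive.
struct FakeHist {
    static int alive;
    FakeHist() { ++alive; }
    ~FakeHist() { --alive; }
};
int FakeHist::alive = 0;

// Overwrite the raw pointer only: every previously allocated object stays allocated.
// This function leaks on purpose to show the effect.
int count_after_leaky_replace(int n) {
    const int before = FakeHist::alive;
    std::map<std::string, FakeHist*> histogram_map;
    for (int i = 0; i < n; ++i)
        histogram_map["electron_pt"] = new FakeHist();  // old pointer value is lost
    return FakeHist::alive - before;
}

// Delete the old object before storing the new pointer.
int count_after_delete_then_replace(int n) {
    const int before = FakeHist::alive;
    std::map<std::string, FakeHist*> histogram_map;
    for (int i = 0; i < n; ++i) {
        // operator[] creates a nullptr entry on first access; delete on nullptr is a no-op.
        delete histogram_map["electron_pt"];
        histogram_map["electron_pt"] = new FakeHist();
    }
    const int alive_now = FakeHist::alive - before;
    delete histogram_map["electron_pt"];  // release the last object as well
    return alive_now;
}

// Let std::unique_ptr destroy the old object on assignment (needs C++14 for make_unique).
int count_after_unique_ptr_replace(int n) {
    const int before = FakeHist::alive;
    std::map<std::string, std::unique_ptr<FakeHist>> histogram_map;
    for (int i = 0; i < n; ++i)
        histogram_map["electron_pt"] = std::make_unique<FakeHist>();
    return FakeHist::alive - before;  // evaluated while the map is still in scope
}

// Small self-check that can be called from main().
void run_checks() {
    assert(count_after_leaky_replace(5) == 5);        // all five objects still alive
    assert(count_after_delete_then_replace(5) == 1);  // only the latest one
    assert(count_after_unique_ptr_replace(5) == 1);   // only the latest one
}
```

With raw pointers, the number of live objects grows with every replacement, while the delete-then-replace and std::unique_ptr variants keep only the most recent object per map key. Whether an explicit delete is the right tool for histograms read from a ROOT file depends on who owns them, which this sketch deliberately leaves out.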