Memory usage using files on eos

Hi,

I am having trouble opening a list of files in a loop.
At a certain point the program crashes with

terminate called after throwing an instance of 'St9bad_alloc'
  what():  std::bad_alloc

I tried to debug this by looking at the memory consumption.
I found that after I open a file the memory consumption increases (and that’s ok),
but when I close and delete the file it doesn’t decrease.
My code is simply doing:

//  vector<TString> fileList is a list of files on eos
  for(size_t f=0; f<fileList.size(); f++){

    cout << "Processing file: " << fileList.at(f) << endl;

    TString fileN = fileList.at(f);
    cout << " Opening..." << endl;
    TFile *ft = TFile::Open( fileN );
    if (!ft || ft->IsZombie()) {  // TFile::Open returns a null pointer on failure
      cout << "Could not open " << fileN << endl;
      delete ft;
      continue;
    }

    cout << "File open " << endl;

    cout << "Memory AFTER Open: kB " <<   getValue() << endl;
    ft->Close();
    delete ft;
    cout << "Memory AFTER delete: kB " <<   getValue() << endl;
}

the output of this code is:

Processing file: root://eosatlas//eos/atlas/atlasgroupdisk/det-ibl/rucio/group/det-ibl/fb/d5/group.det-ibl.343003_001780.EXT1._00001.NTUP.root
Virtual Memory used by this process: kB 146456
 Opening...
File open 
Memory AFTER Open: kB 350616
Memory AFTER delete: kB 350616
Processing file: root://eosatlas//eos/atlas/atlasgroupdisk/det-ibl/rucio/group/det-ibl/18/33/group.det-ibl.343003_001780.EXT1._00002.NTUP.root
Virtual Memory used by this process: kB 350616
 Opening...
File open 
Memory AFTER Open: kB 426528
Memory AFTER delete: kB 426428
Processing file: root://eosatlas//eos/atlas/atlasgroupdisk/det-ibl/rucio/group/det-ibl/4e/2e/group.det-ibl.343003_001780.EXT1._00003.NTUP.root.1
Virtual Memory used by this process: kB 426428
 Opening...
File open 
Memory AFTER Open: kB 502208
Memory AFTER delete: kB 502208
Processing file: root://eosatlas//eos/atlas/atlasgroupdisk/det-ibl/rucio/group/det-ibl/c3/98/group.det-ibl.343003_001780.EXT1._00004.NTUP.root
Virtual Memory used by this process: kB 502208
 Opening...
File open 
Memory AFTER Open: kB 577988
Memory AFTER delete: kB 577988

The code is a compiled C++ executable.
Why, after I do
ft->Close();
delete ft;

doesn’t the memory usage become smaller?

cheers,
delo

These are the functions I used to get the memory usage:


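// Extract the first integer found in a line such as "VmSize:   350616 kB".
int parseLine(const char* line){
  // Skip to the first digit, stopping at the end of the string.
  while (*line != '\0' && (*line < '0' || *line > '9')) line++;
  // atoi stops at the first non-digit, so the trailing " kB" needs no trimming.
  return atoi(line);
}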


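int getValue(){ // Note: this value is in kB!
  FILE* file = fopen("/proc/self/status", "r");
  if (file == NULL) return -1; // /proc/self/status could not be opened
  int result = -1;
  char line[128];

  // Look for the "VmSize:" line (total virtual memory of this process).
  while (fgets(line, sizeof(line), file) != NULL){
    if (strncmp(line, "VmSize:", 7) == 0){
      result = parseLine(line);
      break;
    }
  }
  fclose(file);
  return result;
}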
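
VmSize counts the address space reserved by the process, not the memory actually resident, so it can help to print VmRSS next to it. The following is only a minimal sketch, assuming Linux and the usual /proc/self/status layout; readStatusKb is a hypothetical helper name, not something provided by ROOT:

```cpp
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: return the value (in kB) of a field of
// /proc/self/status such as "VmSize:" or "VmRSS:", or -1 on failure.
long readStatusKb(const char* field) {
  FILE* file = std::fopen("/proc/self/status", "r");
  if (file == NULL) return -1;
  const std::size_t len = std::strlen(field);
  long result = -1;
  char line[256];
  while (std::fgets(line, sizeof(line), file) != NULL) {
    if (std::strncmp(line, field, len) == 0) {
      // strtol skips the leading whitespace and stops at the " kB" suffix.
      result = std::strtol(line + len, NULL, 10);
      break;
    }
  }
  std::fclose(file);
  return result;
}
```

In the loop above one could then print readStatusKb("VmSize:") and readStatusKb("VmRSS:") side by side after the open and after the delete.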

Hi,

does anybody have a hint on this?
I really believe that TFile::Open() has a memory leak when the xrootd plugin is used, which makes my jobs crash.

cheers,
Francesco

[quote=“delo”]Hi,

does anybody have a hint on this?
I really believe that TFile::Open() has a memory leak when the xrootd plugin is used, which makes my jobs crash.

cheers,
Francesco[/quote]

Unfortunately, it’s quite difficult to give you any advice without any actual code reproducing the problem.
The code snippets you’ve shown are not much use on their own. The fact that memory is not released immediately and that you see the same value before and after means little: system memory management is quite a complicated thing, and calling ‘delete’ does not mean that a monitoring tool will immediately show the memory as free. Most probably your code (the part you omitted) contains a memory leak, and quite a serious one if you managed to get a std::bad_alloc.

EDIT: ok, I see, you said this code is enough to get an exception.

Hi,

I got confirmation from the storage experts that the memory leak with xrootd is a known problem.

The code I am using is nothing more than what is in the snippet. If there is a memory leak it has to be in those lines. (I cannot provide running code because I can’t grant access to eos.)

However, the memory usage as I described it is indeed very informative. When running on local files (when the xrootd plugin is not used) it shows the expected behavior (as described in other discussions): the memory usage increases after the first file, and in the following iterations there is no further increase.
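
To make the comparison concrete, the growth per file can be computed from the "Memory AFTER delete" values in the output of the first post (350616, 426428, 502208, 577988 kB). This is just a small sketch of the arithmetic; perFileGrowth is a hypothetical helper name:

```cpp
#include <cstddef>
#include <vector>

// Differences between consecutive memory samples, in kB.
// A steady, nonzero difference for every file points to memory
// that is never given back between iterations.
std::vector<long> perFileGrowth(const std::vector<long>& samplesKb) {
  std::vector<long> growth;
  for (std::size_t i = 1; i < samplesKb.size(); ++i)
    growth.push_back(samplesKb[i] - samplesKb[i - 1]);
  return growth;
}
```

For the four values above this gives 75812, 75780 and 75780 kB, so the process grows by roughly the same amount for every file opened over xrootd.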

D

Hi

We are looking at a similar problem elsewhere, where the xrootd developers have suggested that it is related to memory allocation issues in SL6.

Could you try setting
MALLOC_ARENA_MAX=1
and see if this also alleviates the problem you see?

It looks like this patch also needs to have been applied on the machine for this to work, but I believe it is applied on the lxplus machines:
rhn.redhat.com/errata/RHSA-2012-0058.html
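
The variable is normally exported in the shell before the job starts (export MALLOC_ARENA_MAX=1). For a compiled executable, a roughly equivalent sketch, assuming glibc, is to call mallopt with M_ARENA_MAX at the very start of main; limitMallocArenas is a hypothetical name, and I have not checked this against the xrootd case itself:

```cpp
#include <malloc.h>

// Ask glibc to use a single malloc arena (assumes glibc's mallopt
// and its M_ARENA_MAX parameter). Returns true if the setting was accepted.
// Intended to be called early, before other threads start allocating.
bool limitMallocArenas() {
  return mallopt(M_ARENA_MAX, 1) == 1;
}
```

Calling limitMallocArenas() as the first statement of main and printing the result shows whether the C library accepted the setting.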

Cheers

Wahid