TTree GetEntry memory problem

Dear developers,
I have a memory problem while looping on a TTree.
This has shown up only after a migration from SLC6
to Centos7 on a local farm (but is reproducible on
Lxplus) and only while reading specific files.

On lxplus I was checking using different root version
and different compilers and the effect doesn’t change.
For the last test I used root 6.18 with the default compiler

/cvmfs/sft.cern.ch/lcg/app/releases/ROOT/6.18.04/x86_64-centos7-gcc48-opt/bin/thisroot.csh 

a root file creating this problem is at (not created by me, even if I’m helping with the problem)

/afs/cern.ch/compass/scratch/monthly/d01_del1/bressan/rd_p09mup_s5_bs2_excl.root

The very simple program I used to spot the problem is hereafter,
compiled with

g++ `root-config --cflags --libs` -o test.exe test.C

Thank you any help in debugging.
Andrea Bressan,
PS: I tried
valgrind -v --tool=memcheck --leak-check=yes --num-callers=40 ./test.exe
but with limited insight.

/afs/cern.ch/compass/scratch/monthly/d01_del1/bressan/test.C

#include <vector>
#include <iostream>
#include <fstream>
#include <TFile.h>
#include "TROOT.h"
#include "TMath.h"
#include "TLeaf.h"
#include "TChain.h"


using namespace std;


int main(){

  //////////////////////////////////////////////////////////////////////////////////////////////////////////////////
  ulong all=0;
  TChain USR1_dis("USR1_dis");
  TChain USR1_had("USR1_had");
  USR1_dis.Add("/afs/cern.ch/compass/scratch/monthly/d01_del1/bressan/rd_p09mup_s5_bs2_excl.root");
  USR1_had.Add("/afs/cern.ch/compass/scratch/monthly/d01_del1/bressan/rd_p09mup_s5_bs2_excl.root");
  int rd = USR1_dis.GetEntries(); cout << rd << endl;
  int ra = USR1_had.GetEntries(); cout << ra << endl;
  int en1,en2;
  for(ulong j=0; j<ra; j++){
    en1=USR1_had.GetEntry(j);
    for(ulong i=0; i<rd; i++){
      en2=USR1_dis.GetEntry(i); 
      all++; if(all%1000000==0) cout << all << "  " <<  en2 << " " << en1 << endl;
    }
  }

  return 0;
}

Please read tips for efficient and successful posting and posting code

_ROOT Version:6.18
_Platform:centos7
Compiler: gcc4.8


What “memory problem”?

When running “valgrind” with ROOT, you should use:

valgrind --suppressions=`root-config --etcdir`/valgrind-root.supp ...

BTW. Please read the “Tips for Efficient and Successful Posting” and the “Posting code? Read this first!”.

On the farm where ulimts are set for users it fails with a bad_alloc,
on other systems it goes up to 15 GB ram usage. On SLC6 it stays
constant well below a GB. And that is the case on my trees as it should
be. GetEntry reuses the same space to my knowledge. The size returned
by the entry is not changing. I suspect is coming from reference to class
objects which are used to fill the tree, but I don’t understand the from where
difference may arise from centos7 and slc6.
I’ll recheck valgrind.

In order to get better debug messages, try with:

`root-config --cxx --cflags` -g -o test.exe test.C `root-config --libs`

Yes, I had -g when debugging (exactly same compilation with the libs at the end)

Just to get a better picture, try:
echo "gFile->GetVersion()" | root /.../rd_p09mup_s5_bs2_excl.root

Hi,
can you please try ROOT v6.20, sourced from LCG but with a more recent compiler than gcc 4.8, and *-dbg instead of *-opt so ROOT is compiled with debug symbols? You should add the --suppressions that @Wile_E_Coyote suggests, but use --tool=massif instead of memcheck. You code should also be compiled with -g.

That should give us a clear picture of where the allocations come from.

Cheers,
Enrico

lxplus7107.cern.ch:/afs/cern.ch/user/b/bressan/myscr<241>echo "gFile->GetVersion()" | root rd_p09mup_s5_bs2_excl.root 
   ------------------------------------------------------------------
  | Welcome to ROOT 6.20/02                        https://root.cern |
  | (c) 1995-2020, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Mar 15 2020, 15:25:34                 |
  | From tags/v6-20-02@v6-20-02                                      |
  | Try '.help', '.demo', '.license', '.credits', '.quit'/'.q'       |
   ------------------------------------------------------------------


Attaching file rd_p09mup_s5_bs2_excl.root as _file0...
(TFile *) 0x38c5560
(int) 53434

So, the file was created by ROOT 5.34/34.

Dear Enrico,
thanks. I tried both with gcc8.1/2 and gcc9 but I’m missing some libs. I do have a root.6.16 compiled from scratch by me with gcc6.3, not 20
245 15:25 source /cvmfs/sft.cern.ch/lcg/releases/LCG_97/ROOT/v6.20.02/x86_64-centos7-gcc9-dbg/bin/thisroot.csh /cvmfs/sft.cern.ch/lcg/releases/LCG_97/ROOT/v6.20.02/x86_64-centos7-gcc9-dbg
246 15:26 source /cvmfs/sft.cern.ch/lcg/releases/gcc/9.1.0.1-0b417/x86_64-centos7/setup.csh

lxplus794.cern.ch:/afs/cern.ch/user/b/bressan<248>root-config --cxx --cflags -g -o test.exe test.C root-config --libs
g++: error: test.C: No such file or directory
lxplus794.cern.ch:/afs/cern.ch/user/b/bressan<249>cd myscr/
lxplus794.cern.ch:/afs/cern.ch/user/b/bressan/myscr<250>root-config --cxx --cflags -g -o test.exe test.C root-config --libs
/cvmfs/sft.cern.ch/lcg/releases/binutils/2.30-e5b21/x86_64-centos7/bin/ld: warning: libvdt.so, needed by /cvmfs/sft.cern.ch/lcg/releases/ROOT/v6.20.02-10e75/x86_64-centos7-gcc9-dbg/lib/libROOTVecOps.so, not found (try using -rpath or -rpath-link)
/cvmfs/sft.cern.ch/lcg/releases/ROOT/v6.20.02-10e75/x86_64-centos7-gcc9-dbg/lib/libImt.so: undefined reference to tbb::interface7::internal::task_arena_base::internal_current_slot()' /cvmfs/sft.cern.ch/lcg/releases/ROOT/v6.20.02-10e75/x86_64-centos7-gcc9-dbg/lib/libImt.so: undefined reference to tbb::interface7::internal::isolate_within_arena(tbb::interface7::internal::delegate_base&, long)’
/cvmfs/sft.cern.ch/lcg/releases/ROOT/v6.20.02-10e75/x86_64-centos7-gcc9-dbg/lib/libImt.so: undefined reference to tbb::interface7::internal::task_arena_base::internal_terminate()' /cvmfs/sft.cern.ch/lcg/releases/ROOT/v6.20.02-10e75/x86_64-centos7-gcc9-dbg/lib/libImt.so: undefined reference to tbb::interface7::internal::task_arena_base::internal_initialize()’
/cvmfs/sft.cern.ch/lcg/releases/ROOT/v6.20.02-10e75/x86_64-centos7-gcc9-dbg/lib/libImt.so: undefined reference to `tbb::interface7::internal::task_arena_base::internal_execute(tbb::interface7::internal::delegate_base&) const’
collect2: error: ld returned 1 exit status

The path to get the full environment is, for instance, source /cvmfs/sft.cern.ch/lcg/views/LCG_97/x86_64-centos7-gcc8-dbg/setup.sh (note it’s views and not releases, which only gets you ROOT but not all dependencies).

Dear Enrico,
here is the log of
valgrind -v --suppressions=root-config --etcdir/valgrind-root.supp --tool=massif --num-callers=40 ./test.exe
top was in the meanwhile giving a constant 0.7% memory consumption.

the logs are in
/afs/cern.ch/compass/scratch/monthly/d01_del1/bressan/massif.out.6199
I’m not able to load it in full.
Andrea

top was in the meanwhile giving a constant 0.7% memory consumption.

Doesn’t that mean that the problematic behavior was not triggered? Is it present in 6.18 and absent in 6.20, leaving everything else (compiler, dbg/opt, within/without valgrind) unchanged?

Dear Enrico,

For the moment it means that with 6.20, dbg and gcc9 was not present,
with and without valgrind.
I’ll check the rest and let you know. In any case thanks since there is a
working solution available.

Andrea

Ok, good! It is much more likely that the presence/absence of the bug is only a function of ROOT’s version than that it also depends on compiler or compilation flags (still possible, but much less likely).

@pcanal might know which changes have been introduced between 6.18 and 6.20 that could fix a high memory usage of TTree::GetEntry.

Cheers,
Enrico

If the file is not compressed, it could be https://github.com/root-project/root/commit/c691c52b4c60b1ff73a6b7ad29e9153d54f8c0d9

Do you have a massif output for the failing case?

I’ve check the compression level for that root tree and is ==1.
I was out a couple of days.

Hereafter the first 4 snapshots of the failing case.
bests

desc: (none)
cmd: ./test.exe
time_unit: i
#-----------
snapshot=0
#-----------
time=0
mem_heap_B=0
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=1
#-----------
time=4049289985
mem_heap_B=372725860
mem_heap_extra_B=818772
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=2
#-----------
time=10874905319
mem_heap_B=619017128
mem_heap_extra_B=833368
mem_stacks_B=0
heap_tree=detailed
n3: 619017128 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
n1: 498965494 0x503DD37: TBuffer::TBuffer(TBuffer::EMode, int) (TBuffer.cxx:85)
n1: 498965494 0x579BF67: TBufferIO::TBufferIO(TBuffer::EMode, int) (TBufferIO.cxx:51)
n4: 498965494 0x5796537: TBufferFile::TBufferFile(TBuffer::EMode, int) (TBufferFile.cxx:89)
n1: 402431759 0x71A6208: TBasket::ReadBasketBuffers(long long, int, TFile*) (TBasket.cxx:419)
n1: 402431759 0x71B0FFE: TBranch::GetBasketImpl(int, TBuffer*) (TBranch.cxx:1245)
n1: 402431759 0x71B1596: TBranch::GetBasketAndFirst(TBasket*&, long long&, TBuffer*) (TBranch.cxx:1362)
n1: 402431759 0x71B1C72: TBranch::GetEntry(long long, int) (TBranch.cxx:1589)
n2: 402431759 0x7222B9A: TTree::GetEntry(long long, int) (TTree.cxx:5439)
n0: 399711348 0x40115A: main (test.C:28)
n0: 2720411 in 1 place, below massif’s threshold (1.00%)
n1: 62900272 0x71AF677: TBranch::GetTransientBuffer(int) (TBranch.cxx:520)
n1: 62900272 0x71A7A34: TBasket::TBasket(char const*, char const*, TBranch*) (TBasket.cxx:78)
n1: 62900272 0x7216003: TTree::CreateBasket(TBranch*) (TTree.cxx:3578)
n1: 62900272 0x71B0A4A: TBranch::GetFreshBasket(TBuffer*) (TBranch.cxx:1806)
n1: 62900272 0x71B10F9: TBranch::GetBasketImpl(int, TBuffer*) (TBranch.cxx:1227)
n1: 62900272 0x71B1596: TBranch::GetBasketAndFirst(TBasket*&, long long&, TBuffer*) (TBranch.cxx:1362)
n1: 62900272 0x71B1C72: TBranch::GetEntry(long long, int) (TBranch.cxx:1589)
n2: 62900272 0x7222B9A: TTree::GetEntry(long long, int) (TTree.cxx:5439)
n0: 31801224 0x401127: main (test.C:26)
n0: 31099048 0x40115A: main (test.C:28)
n1: 33633463 0x71A64F7: TBasket::ReadBasketBuffers(long long, int, TFile*) (TBasket.cxx:419)
n1: 33633463 0x71B0FFE: TBranch::GetBasketImpl(int, TBuffer*) (TBranch.cxx:1245)
n1: 33633463 0x71B1596: TBranch::GetBasketAndFirst(TBasket*&, long long&, TBuffer*) (TBranch.cxx:1362)
n1: 33633463 0x71B1C72: TBranch::GetEntry(long long, int) (TBranch.cxx:1589)
n2: 33633463 0x7222B9A: TTree::GetEntry(long long, int) (TTree.cxx:5439)
n0: 30913052 0x40115A: main (test.C:28)
n0: 2720411 in 1 place, below massif’s threshold (1.00%)
n0: 0 in 4 places, all below massif’s threshold (1.00%)
n1: 93602106 0x57D9F91: TFileCacheRead::SetEnablePrefetchingImpl(bool) (TFileCacheRead.cxx:814)
n1: 93602106 0x57DA3E8: TFileCacheRead::TFileCacheRead(TFile*, int, TObject*) (TFileCacheRead.cxx:145)
n1: 93602106 0x720B622: TTreeCache::TTreeCache(TTree*, int) (TTreeCache.cxx:314)
n1: 93602106 0x7222A48: TTree::SetCacheSizeAux(bool, long long) (TTree.cxx:8467)
n1: 93602106 0x7224009: TTree::LoadTree(long long) (TTree.cxx:6257)
n1: 93602106 0x71DECBB: TChain::LoadTree(long long) (TChain.cxx:1321)
n2: 93602106 0x71DBE32: TChain::GetEntry(long long, int) (TChain.cxx:966)
n0: 47225203 0x401127: main (test.C:26)
n0: 46376903 0x40115A: main (test.C:28)
n0: 26449528 in 2828 places, all below massif’s threshold (1.00%)
#-----------
snapshot=3
#-----------
time=19302391483
mem_heap_B=926226400
mem_heap_extra_B=876496
mem_stacks_B=0
heap_tree=detailed
n3: 926226400 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
n1: 805881471 0x503DD37: TBuffer::TBuffer(TBuffer::EMode, int) (TBuffer.cxx:85)
n1: 805881471 0x579BF67: TBufferIO::TBufferIO(TBuffer::EMode, int) (TBufferIO.cxx:51)
n4: 805881471 0x5796537: TBufferFile::TBufferFile(TBuffer::EMode, int) (TBufferFile.cxx:89)
n1: 709347736 0x71A6208: TBasket::ReadBasketBuffers(long long, int, TFile*) (TBasket.cxx:419)
n1: 709347736 0x71B0FFE: TBranch::GetBasketImpl(int, TBuffer*) (TBranch.cxx:1245)
n1: 709347736 0x71B1596: TBranch::GetBasketAndFirst(TBasket*&, long long&, TBuffer*) (TBranch.cxx:1362)
n1: 709347736 0x71B1C72: TBranch::GetEntry(long long, int) (TBranch.cxx:1589)
n2: 709347736 0x7222B9A: TTree::GetEntry(long long, int) (TTree.cxx:5439)
n0: 706627325 0x40115A: main (test.C:28)
n0: 2720411 in 1 place, below massif’s threshold (1.00%)
n1: 62900272 0x71AF677: TBranch::GetTransientBuffer(int) (TBranch.cxx:520)
n1: 62900272 0x71A7A34: TBasket::TBasket(char const*, char const*, TBranch*) (TBasket.cxx:78)
n1: 62900272 0x7216003: TTree::CreateBasket(TBranch*) (TTree.cxx:3578)
n1: 62900272 0x71B0A4A: TBranch::GetFreshBasket(TBuffer*) (TBranch.cxx:1806)
n1: 62900272 0x71B10F9: TBranch::GetBasketImpl(int, TBuffer*) (TBranch.cxx:1227)
n1: 62900272 0x71B1596: TBranch::GetBasketAndFirst(TBasket*&, long long&, TBuffer*) (TBranch.cxx:1362)
n1: 62900272 0x71B1C72: TBranch::GetEntry(long long, int) (TBranch.cxx:1589)
n2: 62900272 0x7222B9A: TTree::GetEntry(long long, int) (TTree.cxx:5439)
n0: 31801224 0x401127: main (test.C:26)
n0: 31099048 0x40115A: main (test.C:28)
n1: 33633463 0x71A64F7: TBasket::ReadBasketBuffers(long long, int, TFile*) (TBasket.cxx:419)
n1: 33633463 0x71B0FFE: TBranch::GetBasketImpl(int, TBuffer*) (TBranch.cxx:1245)
n1: 33633463 0x71B1596: TBranch::GetBasketAndFirst(TBasket*&, long long&, TBuffer*) (TBranch.cxx:1362)
n1: 33633463 0x71B1C72: TBranch::GetEntry(long long, int) (TBranch.cxx:1589)
n2: 33633463 0x7222B9A: TTree::GetEntry(long long, int) (TTree.cxx:5439)
n0: 30913052 0x40115A: main (test.C:28)
n0: 2720411 in 1 place, below massif’s threshold (1.00%)
n0: 0 in 4 places, all below massif’s threshold (1.00%)
n1: 93602106 0x57D9F91: TFileCacheRead::SetEnablePrefetchingImpl(bool) (TFileCacheRead.cxx:814)
n1: 93602106 0x57DA3E8: TFileCacheRead::TFileCacheRead(TFile*, int, TObject*) (TFileCacheRead.cxx:145)
n1: 93602106 0x720B622: TTreeCache::TTreeCache(TTree*, int) (TTreeCache.cxx:314)
n1: 93602106 0x7222A48: TTree::SetCacheSizeAux(bool, long long) (TTree.cxx:8467)
n1: 93602106 0x7224009: TTree::LoadTree(long long) (TTree.cxx:6257)
n1: 93602106 0x71DECBB: TChain::LoadTree(long long) (TChain.cxx:1321)
n2: 93602106 0x71DBE32: TChain::GetEntry(long long, int) (TChain.cxx:966)
n0: 47225203 0x401127: main (test.C:26)
n0: 46376903 0x40115A: main (test.C:28)
n0: 26742823 in 2828 places, all below massif’s threshold (1.00%)
#-----------
snapshot=4
#-----------
time=25169204130
mem_heap_B=1136265048
mem_heap_extra_B=914168
mem_stacks_B=0
heap_tree=detailed
n3: 1136265048 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
n1: 1015668439 0x503DD37: TBuffer::TBuffer(TBuffer::EMode, int) (TBuffer.cxx:85)
n1: 1015668439 0x579BF67: TBufferIO::TBufferIO(TBuffer::EMode, int) (TBufferIO.cxx:51)
n4: 1015668439 0x5796537: TBufferFile::TBufferFile(TBuffer::EMode, int) (TBufferFile.cxx:89)
n1: 920720888 0x71A6208: TBasket::ReadBasketBuffers(long long, int, TFile*) (TBasket.cxx:419)
n1: 920720888 0x71B0FFE: TBranch::GetBasketImpl(int, TBuffer*) (TBranch.cxx:1245)
n1: 920720888 0x71B1596: TBranch::GetBasketAndFirst(TBasket*&, long long&, TBuffer*) (TBranch.cxx:1362)
n1: 920720888 0x71B1C72: TBranch::GetEntry(long long, int) (TBranch.cxx:1589)
n2: 920720888 0x7222B9A: TTree::GetEntry(long long, int) (TTree.cxx:5439)
n0: 918000477 0x40115A: main (test.C:28)
n0: 2720411 in 1 place, below massif’s threshold (1.00%)
n1: 62900272 0x71AF677: TBranch::GetTransientBuffer(int) (TBranch.cxx:520)
n1: 62900272 0x71A7A34: TBasket::TBasket(char const*, char const*, TBranch*) (TBasket.cxx:78)
n1: 62900272 0x7216003: TTree::CreateBasket(TBranch*) (TTree.cxx:3578)
n1: 62900272 0x71B0A4A: TBranch::GetFreshBasket(TBuffer*) (TBranch.cxx:1806)
n1: 62900272 0x71B10F9: TBranch::GetBasketImpl(int, TBuffer*) (TBranch.cxx:1227)
n1: 62900272 0x71B1596: TBranch::GetBasketAndFirst(TBasket*&, long long&, TBuffer*) (TBranch.cxx:1362)
n1: 62900272 0x71B1C72: TBranch::GetEntry(long long, int) (TBranch.cxx:1589)
n2: 62900272 0x7222B9A: TTree::GetEntry(long long, int) (TTree.cxx:5439)
n0: 31801224 0x401127: main (test.C:26)
n0: 31099048 0x40115A: main (test.C:28)
n1: 32047279 0x71A64F7: TBasket::ReadBasketBuffers(long long, int, TFile*) (TBasket.cxx:419)
n1: 32047279 0x71B0FFE: TBranch::GetBasketImpl(int, TBuffer*) (TBranch.cxx:1245)
n1: 32047279 0x71B1596: TBranch::GetBasketAndFirst(TBasket*&, long long&, TBuffer*) (TBranch.cxx:1362)
n1: 32047279 0x71B1C72: TBranch::GetEntry(long long, int) (TBranch.cxx:1589)
n2: 32047279 0x7222B9A: TTree::GetEntry(long long, int) (TTree.cxx:5439)
n0: 29326868 0x40115A: main (test.C:28)
n0: 2720411 in 1 place, below massif’s threshold (1.00%)
n0: 0 in 4 places, all below massif’s threshold (1.00%)
n1: 93602106 0x57D9F91: TFileCacheRead::SetEnablePrefetchingImpl(bool) (TFileCacheRead.cxx:814)
n1: 93602106 0x57DA3E8: TFileCacheRead::TFileCacheRead(TFile*, int, TObject*) (TFileCacheRead.cxx:145)
n1: 93602106 0x720B622: TTreeCache::TTreeCache(TTree*, int) (TTreeCache.cxx:314)
n1: 93602106 0x7222A48: TTree::SetCacheSizeAux(bool, long long) (TTree.cxx:8467)
n1: 93602106 0x7224009: TTree::LoadTree(long long) (TTree.cxx:6257)
n1: 93602106 0x71DECBB: TChain::LoadTree(long long) (TChain.cxx:1321)
n2: 93602106 0x71DBE32: TChain::GetEntry(long long, int) (TChain.cxx:966)
n0: 47225203 0x401127: main (test.C:26)
n0: 46376903 0x40115A: main (test.C:28)
n0: 26994503 in 2828 places, all below massif’s threshold (1.00%)
#-----------

Humm this is odd. if you give me access (pcanal) to the file, I will try to reproduce the problem.

Cheers,
Philippe.

Dear Philippe,
I didn’t have administer ACLs on that temporary space, so I had to ask.
Now read rights are set for you on
/afs/cern.ch/compass/scratch/monthly/d01_del1/bressan/rootprobs
and files therein.

Thanks for the help on this issue.
Andrea

This is still failing for me:

-bash-4.2$ ls /afs/cern.ch/compass/scratch/monthly/d01_del1/bressan/rd_p09mup_s5_bs2_excl.root
ls: cannot access /afs/cern.ch/compass/scratch/monthly/d01_del1/bressan/rd_p09mup_s5_bs2_excl.root: Permission denied
-bash-4.2$ ls /afs/cern.ch/compass/scratch/monthly/d01_del1/bressan/rootprobs
ls: cannot access /afs/cern.ch/compass/scratch/monthly/d01_del1/bressan/rootprobs: Permission denied
-bash-4.2$ ls /afs/cern.ch/compass/scratch/monthly/d01_del1/bressan/
ls: cannot access /afs/cern.ch/compass/scratch/monthly/d01_del1/bressan/: Permission denied
-bash-4.2$ ls /afs/cern.ch/compass/scratch/monthly/d01_del1
ls: cannot open directory /afs/cern.ch/compass/scratch/monthly/d01_del1: Permission denied
-bash-4.2$ ls /afs/cern.ch/compass/scratch/monthly/
d01_del1  d02_del1  d03_del1  d04_del8  d05_del8  d06_del8  d07_del15  d08_del15  d09_del15  d10_del22  d11_del22  d12_del22  RulesToClean.txt