Hunting the memory leak with valgrind

Hi all,

I’m trying to find the memory leak in an analysis application for PHENIX, but I am not able to ‘decode’ the loss records that point directly into ROOT. Could anybody give me some hints on where to look and what to look for?

I started the application with

valgrind --tool=memcheck --num-callers=20 --leak-check=yes --show-reachable=yes --logfile=with_supp_reachable.log --suppressions=$ROOTSYS/root.supp root.exe -b 'SpinAna.C(88869,10)'

The largest loss record:

==18787== 30913344 bytes in 3608 blocks are still reachable in loss record 476 of 476
==18787== at 0x1B90354C: malloc (vg_replace_malloc.c:130)
==18787== by 0x1C00E6DC: G__search_tagname (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1BFD3999: G__get_linked_tagnum (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1C057561: G__cpp_setup_tagtableG__stream (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1C057709: G__cpp_setupG__stream (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1C0047D2: G__pragma (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1BFE6DDA: G__keyword_anytime_7 (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1BFEB532: G__exec_statement (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1BFCC325: G__loadfile (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1BFC9B30: G__include_file (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1BFEC6B5: G__exec_statement (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1C00F1A6: G__define_struct (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1BFEC70C: G__exec_statement (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1BFCC325: G__loadfile (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1BFC9B30: G__include_file (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1BFEC6B5: G__exec_statement (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1BF87AB2: G__exec_tempfile_core (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1BF87C8F: G__exec_tempfile_fp (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1BFF5617: G__process_cmd (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==18787== by 0x1BA8F8A4: TCint::ProcessLine(char const*, TInterpreter::EErrorCode*) (TCint.cxx:310)

Thank you, best regards,

Antonin

with_supp_reachable.log.pid18787.txt (955 KB)

Hi.

I’m not good with valgrind, but:
instead of this huge log, could you please post a minimal piece of YOUR code that gives the same valgrind leak diagnostic with a much smaller log? There are 467 records, and all refer to YOUR code; how can somebody understand what’s wrong without the source?

Please post minimal compilable code that shows the same error.

Which ROOT version are you using?

==18787== 30913344 bytes in 3608 blocks are still reachable in loss record 476 of 476
==18787== at 0x1B90354C: malloc (vg_replace_malloc.c:130)
==18787== by 0x1C00E6DC: G__search_tagname (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)

This is actually not a leak but a lack of cleanup at the end of the process. We are working on some significant modifications of CINT and will try to reduce these valgrind reports as much as possible.

Cheers,
Philippe.
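
To illustrate the distinction made above, here is a minimal sketch of the two situations memcheck tells apart. It is not CINT code: `g_type_table`, `register_type` and `forget_block` are names invented for this example, and the global array only loosely stands in for a table that an interpreter fills once and keeps until the process exits.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a table that lives for the whole process
// (invented for this illustration, not taken from CINT).
static void* g_type_table[8];

// The block stays referenced from a global until exit. If it is never
// freed, memcheck would be expected to list it as "still reachable":
// missing cleanup, but not memory that grows while the job runs.
// The caller must pass slot < 8.
void* register_type(std::size_t slot, std::size_t bytes) {
    g_type_table[slot] = std::malloc(bytes);
    return g_type_table[slot];
}

// The only pointer to the block disappears when the function returns.
// memcheck would be expected to list it as "definitely lost": a real leak.
void forget_block(std::size_t bytes) {
    void* p = std::malloc(bytes);
    static_cast<void>(p);  // deliberately neither stored nor freed
}
```

Calling both functions from a small `main` and running the binary under the same valgrind options as in the first post should give one record of each kind. Building with `-g -O0` is advisable, since an optimizer may remove the unused allocation in `forget_block`.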

[quote=“tpochep”]There are 467 records, and all refer to YOUR code; how can somebody understand what’s wrong without the source?

Please post minimal compilable code that shows the same error.[/quote]

You are absolutely right (on the other hand, many of them come directly from ROOT). But if I had such minimal code, I would probably be able to see the core of the problem myself.

I am not the author of the code; I only have an application with a problem, and I’m trying to find out where the problem is.

The most interesting leak report seems to be the following one:

[size=67]
==15612== 3608000 bytes in 3608 blocks are still reachable in loss record 465 of 466
==15612== at 0x1B90354C: malloc (vg_replace_malloc.c:130)
==15612== by 0x1C00E615: G__search_tagname (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==15612== by 0x1BFD3999: G__get_linked_tagnum (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==15612== by 0x41157AB0: G__cpp_setup_tagtablePgPostMvdCalibBank_dict (PgPostMvdCalibBank_dict.C:733)
==15612== by 0x41157B09: G__cpp_setupPgPostMvdCalibBank_dict (PgPostMvdCalibBank_dict.C:740)
==15612== by 0x1BFC5B57: G__call_setup_funcs (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==15612== by 0x41157B9F: __static_initialization_and_destruction_0(int, int) (PgPostMvdCalibBank_dict.C:757)
==15612== by 0x41157D55: _GLOBAL__I__ZN4ROOT20GenerateInitInstanceEPK18PgPostMvdCalibBank (TVectorProxy.h:37)
==15612== by 0x41169AB0: (within /afs/rhic.bnl.gov/phenix/PHENIX_LIB/sys/i386_sl301/pro.59/lib/libPgCal.so.0.0.0)
==15612== by 0x41016BC8: (within /afs/rhic.bnl.gov/phenix/PHENIX_LIB/sys/i386_sl301/pro.59/lib/libPgCal.so.0.0.0)
==15612== by 0x1B8F0AE0: _dl_init (in /lib/ld-2.3.2.so)
==15612== by 0x1CB5C901: dl_open_worker (in /lib/tls/libc-2.3.2.so)
==15612== by 0x1B8F0895: _dl_catch_error (in /lib/ld-2.3.2.so)
==15612== by 0x1CB5C371: _dl_open (in /lib/tls/libc-2.3.2.so)
==15612== by 0x1C955D3A: dlopen_doit (in /lib/libdl-2.3.2.so)
==15612== by 0x1B8F0895: _dl_catch_error (in /lib/ld-2.3.2.so)
==15612== by 0x1C9554B5: _dlerror_run (in /lib/libdl-2.3.2.so)
==15612== by 0x1C955CE1: dlopen@GLIBC_2.0 (in /lib/libdl-2.3.2.so)
==15612== by 0x1C007C07: G__dlopen (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==15612== by 0x1C00826F: G__shl_load (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
[/size]

It looks to me like a collection somewhere that is never freed. But I don’t know how to find the place in the code where it was created.

The ROOT version is 4.01/02 (1 December 2004).

Thank you

[size=67]
==28211== 1457632 bytes in 3608 blocks are still reachable in loss record 467 of 470
==28211== at 0x1B90354C: malloc (vg_replace_malloc.c:130)
==28211== by 0x1C00E8A6: G__search_tagname (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BFD3999: G__get_linked_tagnum (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x41146C02: G__cpp_setup_tagtablePgPostCoordinateBank_dict (PgPostCoordinateBank_dict.C:735)
==28211== by 0x41146C39: G__cpp_setupPgPostCoordinateBank_dict (PgPostCoordinateBank_dict.C:740)
==28211== by 0x1BFC5B57: G__call_setup_funcs (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x41146CCF: __static_initialization_and_destruction_0(int, int) (PgPostCoordinateBank_dict.C:757)
==28211== by 0x41146E85: _GLOBAL__I__ZN4ROOT20GenerateInitInstanceEPK20PgPostCoordinateBank (TVectorProxy.h:37)
==28211== by 0x41169AB0: (within /afs/rhic.bnl.gov/phenix/PHENIX_LIB/sys/i386_sl301/pro.59/lib/libPgCal.so.0.0.0)
==28211== by 0x41016BC8: (within /afs/rhic.bnl.gov/phenix/PHENIX_LIB/sys/i386_sl301/pro.59/lib/libPgCal.so.0.0.0)
==28211== by 0x1B8F0AE0: _dl_init (in /lib/ld-2.3.2.so)
==28211== by 0x1CB5C901: dl_open_worker (in /lib/tls/libc-2.3.2.so)
==28211== by 0x1B8F0895: _dl_catch_error (in /lib/ld-2.3.2.so)
==28211== by 0x1CB5C371: _dl_open (in /lib/tls/libc-2.3.2.so)
==28211== by 0x1C955D3A: dlopen_doit (in /lib/libdl-2.3.2.so)
==28211== by 0x1B8F0895: _dl_catch_error (in /lib/ld-2.3.2.so)
==28211== by 0x1C9554B5: _dlerror_run (in /lib/libdl-2.3.2.so)
==28211== by 0x1C955CE1: dlopen@GLIBC_2.0 (in /lib/libdl-2.3.2.so)
==28211== by 0x1C007C07: G__dlopen (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1C00826F: G__shl_load (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BFCC205: G__loadfile (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BA8F741: TCint::Load(char const*, bool) (TCint.cxx:269)
==28211== by 0x1BA5D369: TSystem::Load(char const*, char const*, bool) (TSystem.cxx:1303)
==28211== by 0x1BB17F07: TUnixSystem::Load(char const*, char const*, bool) (TUnixSystem.cxx:2058)
==28211== by 0x1BC21F48: G__G__Base2_246_6_9(G__value*, char const*, G__param*, int) (G__Base2.cxx:20276)
==28211== by 0x1BFD14E9: G__call_cppfunc (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BFC0CB2: G__interpret_func (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BFA88B0: G__getfunction (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1C02D03F: G__getstructmem (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1C027158: G__getvariable (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BF9FE45: G__getitem (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BF9EA5D: G__getexpr (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BFE6741: G__exec_function (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BFED38E: G__exec_statement (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BFC2402: G__interpret_func (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BFA8EB5: G__getfunction (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BF9FE75: G__getitem (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BF9EA5D: G__getexpr (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BF96ECD: G__calc_internal (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BFF3055: G__process_cmd (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCint.so)
==28211== by 0x1BA8F8A4: TCint::ProcessLine(char const*, TInterpreter::EErrorCode*) (TCint.cxx:310)
==28211== by 0x1BA8F9B1: TCint::ProcessLineSynch(char const*, TInterpreter::EErrorCode*) (TCint.cxx:356)
==28211== by 0x1B9F9077: TApplication::ProcessFile(char const*, int*) (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCore.so)
==28211== by 0x1B9F87B6: TApplication::ProcessLine(char const*, bool, int*) (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libCore.so)
==28211== by 0x1C92E780: TRint::Run(bool) (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/lib/libRint.so)
==28211== by 0x80488DC: main (in /afs/rhic.bnl.gov/@sys/opt/phenix/root-4.01.02/bin/root.exe)
[/size]

The report you are focusing on is the one I was talking about. It is actually not a leak but a lack of cleanup (on the CINT end of things) at the end of the process. We are working on some significant modifications of CINT and will try to reduce these valgrind reports as much as possible.

For now, please ignore this valgrind report (and maybe add it to the list of things valgrind should ignore!)

Cheers,
Philippe.
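
One possible way to follow this advice on a log with several hundred records is to total the reported bytes per category, so that the "still reachable" records can be set aside and any "definitely lost" ones stand out. The sketch below is only an illustration: `summarize_leaks` is a made-up helper, and it assumes record header lines shaped like the ones quoted in this thread (`==PID== N bytes in M blocks are ... in loss record X of Y`). Dropping `--show-reachable=yes` from the command in the first post should also keep the still-reachable records out of the log in the first place.

```cpp
#include <algorithm>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Sum the bytes reported per leak category in a memcheck log.
// Only loss-record header lines are examined; stack frames are skipped.
std::map<std::string, unsigned long long> summarize_leaks(std::istream& log) {
    static const char* const kinds[] = {
        "definitely lost", "possibly lost", "still reachable"};
    std::map<std::string, unsigned long long> totals;
    std::string line;
    while (std::getline(log, line)) {
        if (line.find(" in loss record ") == std::string::npos) continue;
        // Some valgrind versions print thousands separators; drop them.
        line.erase(std::remove(line.begin(), line.end(), ','), line.end());
        std::istringstream fields(line);
        std::string pid_tag;           // the leading "==PID==" token
        unsigned long long bytes = 0;  // the byte count that follows it
        if (!(fields >> pid_tag >> bytes)) continue;
        for (const char* kind : kinds) {
            if (line.find(kind) != std::string::npos) {
                totals[kind] += bytes;
                break;
            }
        }
    }
    return totals;
}
```

Opening the log with a `std::ifstream` and passing it to `summarize_leaks` gives one total per category that actually occurs in the file. If the "definitely lost" total is small or absent while "still reachable" dominates, that would be consistent with the explanation above.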