
Issue identifying source of memleak


ROOT Version: 6.20/05
Platform: Ubuntu 20.04 x86_64
Compiler: GCC 9.3.0


Hi,

I’m experiencing a memory leak in my code that slowly builds up over the run time. When I use valgrind to identify the leak sources, the most severe one is listed as follows:

==16302== 4,241,664 bytes in 2,104 blocks are possibly lost in loss record 13,789 of 13,790
==16302==    at 0x483BE63: operator new(unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==16302==    by 0x154CE2CA: clang::ModuleMap::findOrCreateModule(llvm::StringRef, clang::Module*, bool, bool) (in /opt/root/lib/libCling.so)
==16302==    by 0x154DBEAF: clang::ModuleMapParser::parseModuleDecl() (in /opt/root/lib/libCling.so)
==16302==    by 0x154DBF4F: clang::ModuleMapParser::parseModuleDecl() (in /opt/root/lib/libCling.so)
==16302==    by 0x154DCADF: clang::ModuleMapParser::parseModuleMapFile() (in /opt/root/lib/libCling.so)
==16302==    by 0x154DCECB: clang::ModuleMap::parseModuleMapFile(clang::FileEntry const*, bool, clang::DirectoryEntry const*, clang::FileID, unsigned int*, clang::SourceLocation) (in /opt/root/lib/libCling.so)
==16302==    by 0x154A0D78: clang::HeaderSearch::loadModuleMapFileImpl(clang::FileEntry const*, bool, clang::DirectoryEntry const*, clang::FileID, unsigned int*) (in /opt/root/lib/libCling.so)
==16302==    by 0x154A1C58: clang::HeaderSearch::loadModuleMapFile(clang::FileEntry const*, bool, clang::FileID, unsigned int*, llvm::StringRef) (in /opt/root/lib/libCling.so)
==16302==    by 0x132DCCD2: (anonymous namespace)::createCIImpl(std::unique_ptr<llvm::MemoryBuffer, std::default_delete<llvm::MemoryBuffer> >, cling::CompilerOptions const&, char const*, std::unique_ptr<clang::ASTConsumer, std::default_delete<clang::ASTConsumer> >, std::vector<std::shared_ptr<clang::ModuleFileExtension>, std::allocator<std::shared_ptr<clang::ModuleFileExtension> > > const&, bool, bool) (in /opt/root/lib/libCling.so)
==16302==    by 0x132DE67F: cling::CIFactory::createCI(llvm::StringRef, cling::InvocationOptions const&, char const*, std::unique_ptr<clang::ASTConsumer, std::default_delete<clang::ASTConsumer> >, std::vector<std::shared_ptr<clang::ModuleFileExtension>, std::allocator<std::shared_ptr<clang::ModuleFileExtension> > > const&) (in /opt/root/lib/libCling.so)
==16302==    by 0x13384330: cling::IncrementalParser::IncrementalParser(cling::Interpreter*, char const*, std::vector<std::shared_ptr<clang::ModuleFileExtension>, std::allocator<std::shared_ptr<clang::ModuleFileExtension> > > const&) (in /opt/root/lib/libCling.so)
==16302==    by 0x133048C2: cling::Interpreter::Interpreter(int, char const* const*, char const*, std::vector<std::shared_ptr<clang::ModuleFileExtension>, std::allocator<std::shared_ptr<clang::ModuleFileExtension> > > const&, bool, cling::Interpreter const*) (in /opt/root/lib/libCling.so)
==16302== 
==16302== LEAK SUMMARY:
==16302==    definitely lost: 66,612 bytes in 482 blocks
==16302==    indirectly lost: 1,143,862 bytes in 608 blocks
==16302==      possibly lost: 12,603,877 bytes in 9,909 blocks
==16302==    still reachable: 4,085,245 bytes in 37,394 blocks
==16302==                       of which reachable via heuristic:
==16302==                         newarray           : 37,648 bytes in 66 blocks
==16302==                         multipleinheritance: 47,352 bytes in 4 blocks
==16302==         suppressed: 22,764,278 bytes in 26,407 blocks
==16302== Reachable blocks (those to which a pointer was found) are not shown.
==16302== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==16302== 

Unfortunately I have no idea where to look. Does anyone have an idea what in my code could trigger something like this?

Cheers,
Simon

Forgot to mention, I’m using the suppression file for valgrind shipped with ROOT.

@vvassilev or @Axel can you help?

This looks like the accumulation of JITed code. To find out what is triggering the JIT compilation, rerun valgrind with deeper stack traces using the option --num-callers=50.

Cheers,
Philippe.

Hi @pcanal

thanks, that already got me a bit further - it happens when filling trees:

==586986== 393,228 bytes in 1 blocks are possibly lost in loss record 18,582 of 18,594
==586986==    at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==586986==    by 0x15710EA2: llvm::StringMapImpl::RehashTable(unsigned int) (in /opt/root/lib/libCling.so)
==586986==    by 0x13960F0B: clang::ASTReader::DecodeIdentifierInfo(unsigned int) (in /opt/root/lib/libCling.so)
==586986==    by 0x1396F675: clang::ASTReader::ReadDeclarationName(clang::serialization::ModuleFile&, llvm::SmallVector<unsigned long, 64u> const&, unsigned int&) (in /opt/root/lib/libCling.so)
==586986==    by 0x13994994: clang::ASTDeclReader::VisitDeclaratorDecl(clang::DeclaratorDecl*) (in /opt/root/lib/libCling.so)
==586986==    by 0x139A24C5: clang::ASTDeclReader::VisitFunctionDecl(clang::FunctionDecl*) (in /opt/root/lib/libCling.so)
==586986==    by 0x139A3858: clang::ASTDeclReader::VisitCXXMethodDecl(clang::CXXMethodDecl*) (in /opt/root/lib/libCling.so)
==586986==    by 0x139AC672: clang::ASTDeclReader::Visit(clang::Decl*) (in /opt/root/lib/libCling.so)
==586986==    by 0x139ACC30: clang::ASTReader::ReadDeclRecord(unsigned int) (in /opt/root/lib/libCling.so)
==586986==    by 0x13931C45: clang::ASTReader::GetDecl(unsigned int) [clone .part.0] (in /opt/root/lib/libCling.so)
==586986==    by 0x13960087: clang::ASTReader::FindExternalLexicalDecls(clang::DeclContext const*, llvm::function_ref<bool (clang::Decl::Kind)>, llvm::SmallVectorImpl<clang::Decl*>&) (in /opt/root/lib/libCling.so)
==586986==    by 0x13A6A349: clang::MultiplexExternalSemaSource::FindExternalLexicalDecls(clang::DeclContext const*, llvm::function_ref<bool (clang::Decl::Kind)>, llvm::SmallVectorImpl<clang::Decl*>&) (in /opt/root/lib/libCling.so)
==586986==    by 0x1522FAF7: clang::DeclContext::LoadLexicalDeclsFromExternalStorage() const (in /opt/root/lib/libCling.so)
==586986==    by 0x1522FC8C: clang::DeclContext::decls_begin() const (in /opt/root/lib/libCling.so)
==586986==    by 0x131E10A4: TClingDataMemberInfo::TClingDataMemberInfo(cling::Interpreter*, TClingClassInfo*) (in /opt/root/lib/libCling.so)
==586986==    by 0x131883F0: TCling::DataMemberInfo_Factory(ClassInfo_t*) const (in /opt/root/lib/libCling.so)
==586986==    by 0xE638F14: TListOfDataMembers::Load() (in /opt/root/lib/libCore.so)
==586986==    by 0xE610C32: TClass::GetListOfDataMembers(bool) (in /opt/root/lib/libCore.so)
==586986==    by 0xE61F109: TClass::GetDataMember(char const*) const (in /opt/root/lib/libCore.so)
==586986==    by 0xE6228C1: TBuildRealData::Inspect(TClass*, char const*, char const*, void const*, bool) (in /opt/root/lib/libCore.so)
==586986==    by 0x131ADF49: TCling::InspectMembers(TMemberInspector&, void const*, TClass const*, bool) (in /opt/root/lib/libCling.so)
==586986==    by 0xE612C34: TClass::CallShowMembers(void const*, TMemberInspector&, bool) const (in /opt/root/lib/libCore.so)
==586986==    by 0xE620F5C: TClass::BuildRealData(void*, bool) (in /opt/root/lib/libCore.so)
==586986==    by 0xDC5BF1C: TBufferFile::WriteClassBuffer(TClass const*, void*) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDCF6FCF: TKey::TKey(TObject const*, char const*, int, TDirectory*) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDCB9CE8: TFile::CreateKey(TDirectory*, TObject const*, char const*, int) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDCAB41B: TDirectoryFile::WriteTObject(TObject const*, char const*, char const*, int) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDCBB64C: TFile::WriteProcessID(TProcessID*) (in /opt/root/lib/libRIO.so)
==586986==    by 0xE55D38C: TObject::Streamer(TBuffer&) (in /opt/root/lib/libCore.so)
==586986==    by 0xDEC678E: int TStreamerInfo::WriteBufferAux<char**>(TBuffer&, char** const&, TStreamerInfo::TCompInfo* const*, int, int, int, int, int) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDD1E1A3: TStreamerInfoActions::GenericWriteAction(TBuffer&, void*, TStreamerInfoActions::TConfiguration const*) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDC5BE11: TBufferFile::WriteClassBuffer(TClass const*, void*) (in /opt/root/lib/libRIO.so)
==586986==    by 0xE615589: TClass::WriteBuffer(TBuffer&, void*, char const*) (in /opt/root/lib/libCore.so)
==586986==    by 0xE654D7C: TStreamerBase::WriteBuffer(TBuffer&, char*) (in /opt/root/lib/libCore.so)
==586986==    by 0xDEC8C94: int TStreamerInfo::WriteBufferAux<char**>(TBuffer&, char** const&, TStreamerInfo::TCompInfo* const*, int, int, int, int, int) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDD1E1A3: TStreamerInfoActions::GenericWriteAction(TBuffer&, void*, TStreamerInfoActions::TConfiguration const*) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDC5BE11: TBufferFile::WriteClassBuffer(TClass const*, void*) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDC5B9E0: TBufferFile::WriteObjectClass(void const*, TClass const*, bool) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDC627CE: TBufferIO::WriteObjectAny(void const*, TClass const*, bool) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDCA7E41: TEmulatedCollectionProxy::WriteItems(int, TBuffer&) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDCA8236: TEmulatedCollectionProxy::Streamer(TBuffer&) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDCA2114: TCollectionStreamer::Streamer(TBuffer&, void*, int, TClass*) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDC55B6A: TBufferFile::WriteFastArray(void*, TClass const*, int, TMemberStreamer*) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDEC83FA: int TStreamerInfo::WriteBufferAux<char**>(TBuffer&, char** const&, TStreamerInfo::TCompInfo* const*, int, int, int, int, int) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDD1E1A3: TStreamerInfoActions::GenericWriteAction(TBuffer&, void*, TStreamerInfoActions::TConfiguration const*) (in /opt/root/lib/libRIO.so)
==586986==    by 0xDC555C4: TBufferFile::ApplySequence(TStreamerInfoActions::TActionSequence const&, void*) (in /opt/root/lib/libRIO.so)
==586986==    by 0xCCF14C9: TBranch::FillImpl(ROOT::Internal::TBranchIMTHelper*) [clone .part.0] (in /opt/root/lib/libTree.so)
==586986==    by 0xCCFEFDA: TBranchElement::FillImpl(ROOT::Internal::TBranchIMTHelper*) (in /opt/root/lib/libTree.so)
==586986==    by 0xCD6AF65: TTree::Fill() (in /opt/root/lib/libTree.so)
==586986==    by 0x4BA5B64: allpix::ROOTObjectWriterModule::run(allpix::Event*) (ROOTObjectWriterModule.cpp:204)
==586986== 
==586986== LEAK SUMMARY:
==586986==    definitely lost: 64,839 bytes in 552 blocks
==586986==    indirectly lost: 1,474,491 bytes in 4,873 blocks
==586986==      possibly lost: 884,695 bytes in 278 blocks
==586986==    still reachable: 41,487,520 bytes in 141,045 blocks
==586986==                       of which reachable via heuristic:
==586986==                         newarray           : 30,576 bytes in 62 blocks
==586986==         suppressed: 1,397,993 bytes in 9,101 blocks
==586986== Reachable blocks (those to which a pointer was found) are not shown.
==586986== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==586986== 
==586986== For lists of detected and suppressed errors, rerun with: -s
==586986== ERROR SUMMARY: 235 errors from 235 contexts (suppressed: 7388606 from 1581)

The only thing I recently changed in that code was adding a nested class and setting #pragma link C++ nestedclasses; - can that be related?
This class has transient members, in case that is of relevance:

    class Object : public TObject {
        ClassDefOverride(Object, 3);
    public:
        template <class T> class BaseWrapper {
        public:
            BaseWrapper() = default;
            virtual T* get() const = 0;
            void store() { ref_ = get(); }

            ClassDef(BaseWrapper, 1);

        protected:
            virtual ~BaseWrapper() = default;

            mutable T* ptr_{};           //! transient value
            mutable bool loaded_{false}; //! transient value
            TRef ref_{};
        };
    };

(a bit abbreviated)

I’ll try to git bisect but there have been lots of changes since, so not sure how lucky I’ll be.

/Simon

That one should not be a real problem. Those operations happen once per class, and the allocated memory is persistent (for the lifetime of the process) and necessary for full functioning. It just so happens that these allocations are not explicitly freed at the end of the process (in part because, apart from the kind of investigation you are doing, they are harmless, and freeing them in the right order is not always trivial).

So it is more memory hoarding than a leak, and it should not grow when you increase the length/run time of your job.
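
One way to check the "should not grow with run time" claim is to sample the resident memory of the process at regular intervals and see whether it still increases after the warm-up phase. The following is only a minimal sketch under some assumptions: it is Linux-only (it reads /proc/self/statm), and the helper names currentRssBytes and growsAfterWarmup as well as the tolerance value are made up for illustration; they are not part of ROOT or of the code discussed above.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <vector>
#include <unistd.h>

// Hypothetical helper: resident set size of this process in bytes,
// read from /proc/self/statm (Linux only). Returns 0 if unavailable.
std::size_t currentRssBytes() {
    std::ifstream statm("/proc/self/statm");
    std::size_t totalPages = 0;
    std::size_t residentPages = 0;
    // The first two fields of statm are total and resident size, in pages.
    if (!(statm >> totalPages >> residentPages)) {
        return 0;
    }
    const long pageSize = sysconf(_SC_PAGESIZE);
    if (pageSize <= 0) {
        return 0;
    }
    return residentPages * static_cast<std::size_t>(pageSize);
}

// Hypothetical helper: true if memory keeps growing after the warm-up phase.
// One-time allocations (such as per-class setup) are assumed to land in the
// first half of the samples, so only growth between the midpoint sample and
// the last sample is counted.
bool growsAfterWarmup(const std::vector<std::size_t>& samples,
                      std::size_t toleranceBytes) {
    if (samples.size() < 3) {
        return false; // too few samples to separate warm-up from steady state
    }
    const std::size_t mid = samples[samples.size() / 2];
    const std::size_t last = samples.back();
    return last > mid + toleranceBytes;
}

// Usage idea: inside the event loop, push currentRssBytes() into a
// std::vector<std::size_t> every few thousand events, then call
// growsAfterWarmup(samples, tolerance) once the job has finished.
```

If the second half of a long run stays flat within the tolerance, the valgrind records above are most likely the one-time, per-class allocations described here; if it keeps climbing, something else is accumulating.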