Persistent Memory Growth when using cling::Interpreter in a Request Loop

zhu_le · May 1, 2025, 9:22am

I’m encountering a significant issue with memory usage when using cling::Interpreter within a C++ HTTP server. It appears that memory is not being fully released after each use of the interpreter, leading to a steady increase in the process’s Resident Set Size (RSS). This is a critical problem for my application, especially under load and with more complex code execution.

Problem Description:

I’m building a C++ HTTP server where each incoming request requires the execution of dynamic C++ code using a new instance of cling::Interpreter. I’m observing a continuous growth in memory usage (specifically RSS) with each request processed. While there’s some slight memory reduction after a few minutes, it never returns to the baseline, and the overall trend is upward.

The memory growth is particularly severe when more header files are included in the interpreter session, potentially leading to gigabytes of increased memory usage and eventual exhaustion in my production environment.

Observed Behavior:

After processing multiple requests, the RSS memory consistently increases. Here’s a sample of the RSS output I’m seeing:

Current process RSS: 62959616 bytes (61484 KB, 60.043 MB)
Current process RSS: 69967872 bytes (68328 KB, 66.7266 MB)
Current process RSS: 76976128 bytes (75172 KB, 73.4102 MB)
Current process RSS: 83988480 bytes (82020 KB, 80.0977 MB)
Current process RSS: 90992640 bytes (88860 KB, 86.7773 MB)
Current process RSS: 97996800 bytes (95700 KB, 93.457 MB)
Current process RSS: 105005056 bytes (102544 KB, 100.141 MB)

Each request adds approximately 7 MB in this simplified example (consecutive samples above differ by about 7,008,000 bytes). In my real application, which includes more headers, the growth per request is much larger. Memory is recovered only slightly, and only after a significant delay (1-10 minutes in the simple case, much longer in the complex case).

Re-using a single cling::Interpreter instance across requests does fix the memory growth, but this is not feasible for my use case due to potential naming conflicts with variables and definitions between different dynamic code snippets executed for different requests.

Environment:

Operating System: ubuntu:22.04
Cling Version: cling --version reports 1.3~dev

Simplified Code Example:

Below is a simplified version of my code that demonstrates the issue. The server listens for POST requests, and for each request, it creates a new cling::Interpreter instance, includes <vector>, and then attempts cleanup.

#include <cling/Interpreter/Interpreter.h>
#include <cling/Interpreter/Value.h>
#include "httplib.h"
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <unistd.h>

long getResidentSetSize() {
    std::ifstream statm_file("/proc/self/statm");
    long rss = -1;
    if (statm_file.is_open()) {
        long size, resident, shared, text, lib, data, dirty;
        if (statm_file >> size >> resident >> shared >> text >> lib >> data >> dirty) {
            rss = resident; // resident set size in pages
        }
        statm_file.close();
    }
    return rss;
}

int main(int argc, const char* const* argv) {
    httplib::Server svr;
    svr.Post("/", [&](const httplib::Request& req, httplib::Response& res) {
        {
            cling::Interpreter interp(argc, argv, LLVMDIR);
            interp.declare("#include <vector>");
            interp.runAndRemoveStaticDestructors();
            interp.unload(0);
            res.set_header("Connection", "close");
            res.set_content("over", "text/plain");
        }
        long rss_pages = getResidentSetSize();
        if (rss_pages != -1) {
            long page_size = sysconf(_SC_PAGESIZE);
            long rss_bytes = rss_pages * page_size;
            std::cout << "Current process RSS: " << rss_bytes << " bytes ("
                      << rss_bytes / 1024.0 << " KB, "
                      << rss_bytes / 1024.0 / 1024.0 << " MB)" << std::endl;
        } else {
            std::cerr << "Could not read /proc/self/statm" << std::endl;
        }
    });
    std::cout << "Server listening on http://0.0.0.0:3000" << std::endl;
    if (!svr.listen("0.0.0.0", 3000)) {
        std::cerr << "Error starting server!" << std::endl;
        return 1;
    }
    return 0;
}

Attempted Solutions:

I have tried the following steps to mitigate the memory growth, but none have fully resolved the issue:

Ensuring the cling::Interpreter instance is created within a local scope so it is destroyed after each request.
Calling interp.runAndRemoveStaticDestructors() before the interpreter is destroyed.
Calling interp.unload(0) to attempt to unload resources associated with the interpreted code.
Calling malloc_trim(0) (not shown in the simplified example, but I’ve tried it in my actual code). This did not have a significant effect on the RSS growth.
Compiling with -g -fsanitize=leak (via CMake: add_compile_options(-g -fsanitize=leak) and add_link_options(-fsanitize=leak)). No leaks were reported by the sanitizer after running the program.

Despite these efforts, the memory footprint continues to grow with each request, which is unsustainable in a production environment.

Valgrind Output:

I ran the application under Valgrind and terminated it with Ctrl+C (SIGINT). Here is the output:

root@6c68e8f2ebcf:/app/valgrind# valgrind ../build/cling-server
==4716== Memcheck, a memory error detector
==4716== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4716== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==4716== Command: ../build/cling-server
==4716==
Server listening on http://0.0.0.0:3000
Current process RSS: 339873792 bytes (331908 KB, 324.129 MB)
Current process RSS: 367759360 bytes (359140 KB, 350.723 MB)
Current process RSS: 376025088 bytes (367212 KB, 358.605 MB)
Current process RSS: 393347072 bytes (384128 KB, 375.125 MB)
Current process RSS: 408956928 bytes (399372 KB, 390.012 MB)
Current process RSS: 413200384 bytes (403516 KB, 394.059 MB)
Current process RSS: 418865152 bytes (409048 KB, 399.461 MB)
^C==4716==
==4716== Process terminating with default action of signal 2 (SIGINT)
==4716== at 0x7ACDD30: accept4 (accept4.c:31)
==4716== by 0xE18446: httplib::Server::listen_internal() (httplib.h:6991)
==4716== by 0xE14B1D: httplib::Server::listen(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, int) (httplib.h:6552)
==4716== by 0xE05F41: main (cling.cpp:46)
==4716==
==4716== HEAP SUMMARY:
==4716== in use at exit: 35,428,755 bytes in 20,904 blocks
==4716== total heap usage: 127,886 allocs, 106,982 frees, 112,815,693 bytes allocated
==4716==
==4716== LEAK SUMMARY:
==4716== definitely lost: 56 bytes in 7 blocks
==4716== indirectly lost: 0 bytes in 0 blocks
==4716== possibly lost: 5,026,378 bytes in 2,702 blocks
==4716== still reachable: 30,402,321 bytes in 18,195 blocks
==4716== of which reachable via heuristic:
==4716== multipleinheritance: 180,224 bytes in 16 blocks
==4716== suppressed: 0 bytes in 0 blocks
==4716== Rerun with --leak-check=full to see details of leaked memory
==4716==
==4716== For lists of detected and suppressed errors, rerun with: -s
==4716== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Questions:

Are there additional methods within the cling library or related to LLVM that can force a more aggressive and immediate release of memory and resources associated with an interpreter instance after it’s no longer needed?
Is there anything fundamentally incorrect or missing in my implementation regarding the cleanup of cling::Interpreter or the resources it allocates (like JIT’d code, parsed ASTs, etc.) when being used in this kind of request-loop pattern?

Any insights or suggestions from experienced cling/ROOT users would be greatly appreciated. I’m relatively new to C++ and libraries like cling, so I might be overlooking something obvious.

Thank you for your time and help!

StephanH · May 2, 2025, 9:14am

Hello,

I will call on @vvassilev to have an expert look.

What I can tell you in the meantime is that, because the compiler is being “abused” a bit here, there are some places where we cannot give the memory back so easily (the compiler was designed to exit after compiling one source file, not to run inside a long-lived server application).
For the rest, let’s see if @vvassilev knows a way to improve your situation.

vvassilev · May 2, 2025, 10:58am

@StephanH is correct. It is not trivial to reclaim memory.

One workaround you could try is to move your most commonly used header files into a PCH (precompiled header) and attach it at interpreter startup. The PCH is memory-mapped, so declarations are only pulled into memory when they are needed.

zhu_le · May 2, 2025, 11:03am

Thank you for the context and for looping in @vvassilev — I appreciate the support and am happy to provide more details if helpful!