Hi Axel, Chandi,
I probably won’t have time to look into it before Tuesday next week (when I’ll be at CERN for two weeks).
That said, I do have a couple of comments from a quick inspection:
-
Something is leaking file descriptors. That much is clear.
-
I thought I had that debugged and checked when I wrote the class, but I’m willing to look again, because I’m good at making mistakes. 
A quick inspection of my code did not reveal anything (the close and shutdown are where I remembered I put them), but to be sure, I’ll have to trace it, and check that BidirMMapPipe isn’t leaking file descriptors, which requires time…
I also remember testing the code with 1024 open pipes when I wrote it (i.e. 2048 file descriptors), and I did not run into trouble on the file descriptor front. (I could not go higher because I’d run out of process table entries with so many forked-off child processes at the time…)
-
A general remark about resource leaks: just because BidirMMapPipe throws the exception does not mean that BidirMMapPipe is necessarily the offender that’s leaking the resource in question. It’s similar to a memory leak: when you have a memory leak, the out-of-memory condition does not necessarily hit in the code leaking memory; it can hit any code allocating memory. It literally can hit anywhere, and memory allocations are frequent. For file descriptors, the story is very much the same: the fact that the OS tells BidirMMapPipe that we’ve run out of file descriptors doesn’t mean that BidirMMapPipe is leaking them.
-
Have you checked that you make RooFit give up its resources at the end of the loop, and that you close all files you open in that loop? I’ve had a quick look through your massfit.cc file, and I see plenty of pointers, with no clear concept of object ownership, and virtually no explicit or implicit cleanup.
My guess would be that EffFile is not closed properly (opened on line 122 or so)…
(Or it could be that you or RooFit leaks a RooFit object that holds a BidirMMapPipe internally - then we’d be in much the same situation, without BidirMMapPipe being responsible.)
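To illustrate the general remark about resource leaks, here is a small POSIX-only sketch (no ROOT involved). The function names leakyOpen, innocentPipe and demonstrateExhaustion are made up for this example, and the limit of 64 descriptors is an arbitrary choice to make the effect show up quickly. One piece of code leaks descriptors, and a completely well-behaved piece of code is the one that gets the error from the OS:

```cpp
#include <algorithm>
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>
#include <vector>

// Hypothetical "leaky" code: opens a file and never closes it.
int leakyOpen() { return ::open("/dev/null", O_RDONLY); }

// Hypothetical "innocent" code: creates a pipe and cleans up properly.
// Returns 0 on success, or the errno value reported by the OS.
int innocentPipe() {
    int fds[2];
    if (::pipe(fds) != 0) return errno;
    ::close(fds[0]);
    ::close(fds[1]);
    return 0;
}

// Lowers the soft descriptor limit, lets leakyOpen() exhaust it, and
// returns the error that innocentPipe() then sees (-1 on setup failure).
int demonstrateExhaustion() {
    rlimit old{};
    if (getrlimit(RLIMIT_NOFILE, &old) != 0) return -1;
    rlimit small = old;
    small.rlim_cur = std::min<rlim_t>(old.rlim_cur, 64);  // arbitrary small limit
    if (setrlimit(RLIMIT_NOFILE, &small) != 0) return -1;

    // The leak: keep opening until the OS refuses.
    std::vector<int> leaked;
    for (;;) {
        int fd = leakyOpen();
        if (fd < 0) break;
        leaked.push_back(fd);
    }

    // The innocent code is the one that fails.
    int err = innocentPipe();

    // Clean up so the demo itself does not leak, and restore the limit.
    for (int fd : leaked) ::close(fd);
    int rc = setrlimit(RLIMIT_NOFILE, &old);
    assert(rc == 0);
    (void)rc;
    return err;
}
```

On a POSIX system I would expect demonstrateExhaustion() to return EMFILE ("too many open files"), even though innocentPipe() closes everything it opens.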
[Okay, rant of a fellow leak hunter begins - don’t take it personally, it’s gotten a bit longish because I’ve been in similar situations myself, and have a bit of experience on just how frustrating resource leaks can be… Maybe there’s something useful in there for you…]
My first order of business is usually to check my code to see that every new is paired with a delete, and every TFile::Open with a Close.
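Besides checking by eye, a crude runtime check can help: count the open file descriptors before and after a few iterations of the loop. The sketch below is plain POSIX and assumes nothing about ROOT; countOpenFds, fdsLeakedBy and the two example loop bodies are hypothetical helpers written for this post, and the cap of 65536 is an arbitrary choice to keep the scan short on systems with a very large descriptor limit:

```cpp
#include <algorithm>
#include <cassert>
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

// Counts open descriptors by probing each possible descriptor number.
// Descriptors above the (arbitrary) cap are not seen. Returns -1 on error.
int countOpenFds() {
    rlimit lim{};
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0) return -1;
    const rlim_t cap = 65536;
    const int maxFd = static_cast<int>(std::min<rlim_t>(lim.rlim_cur, cap));
    assert(maxFd > 0);
    int n = 0;
    for (int fd = 0; fd < maxFd; ++fd) {
        if (::fcntl(fd, F_GETFD) != -1) ++n;  // fails with EBADF if fd is not open
    }
    return n;
}

// Example loop body that forgets to close what it opens.
void leakyIteration() {
    int fd = ::open("/dev/null", O_RDONLY);
    (void)fd;  // deliberately never closed
}

// Example loop body that cleans up after itself.
void cleanIteration() {
    int fd = ::open("/dev/null", O_RDONLY);
    if (fd >= 0) ::close(fd);
}

// Runs body() the given number of times and reports how many
// descriptors are open afterwards that were not open before.
int fdsLeakedBy(void (*body)(), int iterations) {
    const int before = countOpenFds();
    for (int i = 0; i < iterations; ++i) body();
    return countOpenFds() - before;
}
```

Printing countOpenFds() at the end of every iteration of the fit loop should show quickly whether the number keeps growing, and commenting out parts of the loop body then narrows down which call is responsible.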
Also, one needs to worry about methods returning pointers (or taking pointers as arguments), since that can mean obtaining or transferring object ownership, and that means the responsibility to free things when you’re done can move from ROOT to you (or from your code to ROOT if you transfer ownership)…
I know it’s a pain, especially since the documentation is usually silent about the ownership transfer. I usually have to inspect the sources of the method I’m calling to know when I get or transfer object ownership. I’m not just saying that to make the bug report go away…
I’ve spent months getting complex fits to not run out of memory, so I know just how frustrating that experience can be. The trouble is that much of the code in ROOT and RooFit was written before things like shared_ptr or unique_ptr were invented (and RAII still seems to be undervalued and misunderstood by a large part of the HEP community), so resource management is a pain.
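To make the RAII point concrete, here is a minimal sketch using only the C++ standard library, with a C FILE* standing in for whatever resource you open inside the loop. FileCloser, FilePtr, openFile and readFirstByteRepeatedly are names invented for this example. I would expect the same pattern to work with a std::unique_ptr<TFile> holding the result of TFile::Open, since as far as I know the TFile destructor closes the file, but I have not tested that here:

```cpp
#include <cassert>
#include <cstdio>
#include <memory>
#include <stdexcept>
#include <string>

// Deleter that closes the file when the owning unique_ptr goes away.
struct FileCloser {
    void operator()(std::FILE* f) const noexcept {
        if (f) std::fclose(f);
    }
};
using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// Opens a file and hands ownership to the caller; throws on failure.
FilePtr openFile(const char* path, const char* mode) {
    assert(path && mode);
    FilePtr f(std::fopen(path, mode));
    if (!f) throw std::runtime_error(std::string("cannot open ") + path);
    return f;
}

// Opens the file anew in every iteration and reads its first byte.
// The FilePtr closes the file at the end of each iteration, also on
// early return or exception, so no descriptors pile up.
int readFirstByteRepeatedly(const char* path, int iterations) {
    int last = -1;
    for (int i = 0; i < iterations; ++i) {
        FilePtr f = openFile(path, "rb");
        last = std::fgetc(f.get());
    }
    return last;
}
```

With a raw FILE* and a forgotten fclose, a loop like this would typically fail after about a thousand iterations on a default Linux setup; with the owning pointer there is no cleanup code to forget.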
[Rant ends - sorry for the noise…]
If EffFile above is not the offender, please let me know, and I’ll have a detailed look in BidirMMapPipe next week. But I suspect that your problem stems from the general “leakiness” of the code inside your loop.
Cheers, and let me know how this turns out,
Manuel