Dear all,
we (site admins of a Tier-2/3 cluster site) are facing an issue when users open many ROOT files on our local dCache system. The symptom is that the users occupy so many movers on dCache that they hit the limit and essentially freeze everything.
Their code opens a lot of files but, as far as we could check, also closes them properly and deletes the TFile objects.
While trying to understand the problem, we tried the following (so far only in the interpreter):
root [1] TFile *a=TFile::Open("dcap://grid-se.physik.uni-wuppertal.de/pnfs/physik.uni-wuppertal.de/data/atlas/atlasdatadisk/rucio/data18_13TeV/8b/2f/data18_13TeV.00349263.physics_Main.merge.AOD.f937_m1972._lb0159._0001.1");
This opens (as expected) a socket connection between the machine and our dCache:
[root@top ~]# netstat --tcp -p | grep dcap
tcp 0 0 top.pleiades.uni-:55720 grid-se.physik.uni:dcap ESTABLISHED 12730/root.exe
But now, even after closing the file and deleting the TFile object:
root [3] a->Close();
root [4] delete a;
the socket stays open (and most likely eats a mover on dCache).
We have already searched quite a lot, but hints like
gROOT->GetListOfFiles()->Remove(a);
did not make any difference.
We would appreciate hints on whether this is a “feature” of the interpreter only and, if not, how to ensure that closed files do not keep a connection to dCache established.
Thanks a lot
Torsten (for the Wuppertal site admins)
ROOT Version: 6.18/04
Platform: x86_64 CentOS 7
Compiler: linuxx8664gcc