Frequent failure to update ROOT files at /eos/

Dear Experts,

In my analysis I often need to update a large set of ROOT files (~500-1000 files) that are placed at /eos/… Typically, each time I add new branches to the TTrees in these files.
Unfortunately, ROOT very often fails to update these files. The problem is not easy to reproduce.
While the individual (per-file) failure rate is well below 100%, the combined failure rate (per set of files) often reaches 100%. However, sometimes there are no failures at all.
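
To illustrate how a small per-file rate turns into a near-certain failure for the whole set, here is a minimal Python sketch. It assumes that failures are independent between files, and the per-file rates in it are made-up illustrative values, not measurements.

```python
def combined_failure_rate(per_file_rate: float, n_files: int) -> float:
    """Probability that at least one of n_files updates fails.

    Assumes every file fails independently with the same probability.
    """
    return 1.0 - (1.0 - per_file_rate) ** n_files


# Illustrative (not measured) per-file failure rates and set sizes.
for p in (0.001, 0.005, 0.01):
    for n in (500, 1000):
        rate = combined_failure_rate(p, n)
        print(f"per-file {p:.1%}, {n} files -> combined {rate:.1%}")
```

Under these assumptions, even a 1% per-file rate already gives a combined rate above 99% for 500 files.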

When I keep all these files on afs, there are no failures at all. Therefore I have a strong suspicion that the problem is somehow related to /eos/…

On my machine eos is mounted via FUSE.
I’ve been advised to access the files via the explicit “root://…” protocol, but at first sight this even increases the failure rate.
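
A minimal sketch of how a FUSE path could be turned into such an explicit URL. The helper name `to_xrootd_url`, the host `eos.example.org` and the file path are hypothetical placeholders, not the actual setup.

```python
def to_xrootd_url(fuse_path: str, host: str) -> str:
    """Build an explicit root:// URL from a path under the /eos/ FUSE mount.

    The host is supplied by the caller; nothing is looked up or validated
    against a real EOS instance.
    """
    if not fuse_path.startswith("/eos/"):
        raise ValueError(f"not an /eos/ path: {fuse_path!r}")
    # XRootD URLs keep the absolute path, hence the double slash after the host.
    return f"root://{host}/{fuse_path}"


# Placeholder host and path, for illustration only.
print(to_xrootd_url("/eos/some/dir/file.root", "eos.example.org"))
```

The resulting string would then be passed to `TFile::Open(url, "UPDATE")` instead of the FUSE path.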

In my “main” setup I rely on ROOT builds from the cvmfs LCG dev3 nightly slot, but the problem also persists for the LCG_103 slot.
I’ve also tried several different versions of ROOT, and I have some weak evidence that the failure rate is somewhat smaller for the ROOT version from the LCG_101 slot.

Surely, for the time being I can keep the files on afs, where no problems occur, but my afs quota is not unlimited and I am already close to the limit.
Also, CERN IT discourages the usage of afs.

Is there a recommended solution for this problem?
(For me it is not clear whether it is a ROOT or an EOS problem.
At first sight it looks like an EOS problem, but I’ve looked into the code of TFile::Open and I see that the code explicitly performs some EOS-specific actions.)

I think @Axel and @jblomer are aware of the problem and they can give some details.

This is EOS, you’re not the first to notice :slight_smile:

The way to go is to create a stand-alone reproducer (maybe even without ROOT, just updating a file?) and hand that over to our EOS friends through the CERN Service Portal.
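
A ROOT-free reproducer could start from something like the sketch below. It is an assumption about what such a test might look like, not an agreed procedure: the names `update_and_verify` and `run`, the file naming and the sizes are all hypothetical, and a plain append is a simplification of what a ROOT update actually does to a file.

```python
import os


def update_and_verify(path: str, payload: bytes) -> bool:
    """Append payload to path, then re-read the file and check its content."""
    before = b""
    if os.path.exists(path):
        with open(path, "rb") as f:
            before = f.read()
    with open(path, "ab") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the storage layer
    with open(path, "rb") as f:
        after = f.read()
    return after == before + payload


def run(directory: str, n_files: int = 100, n_rounds: int = 3,
        size: int = 1 << 20) -> list:
    """Update n_files files n_rounds times; return a list of (round, path, reason)."""
    failures = []
    for rnd in range(n_rounds):
        for i in range(n_files):
            path = os.path.join(directory, f"repro_{i:04d}.bin")
            payload = os.urandom(size)
            try:
                if not update_and_verify(path, payload):
                    failures.append((rnd, path, "content mismatch"))
            except OSError as exc:
                failures.append((rnd, path, f"OSError: {exc}"))
    return failures
```

Calling `run()` once with a directory on the FUSE mount and once with a directory on afs or local disk, and comparing the returned failure lists, would give something concrete to attach to the ticket.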

I’d be very interested to hear whether you manage to reproduce this and what the ticket is :slight_smile:
