Error reading TFile

Dear experts,

I have been having trouble reading TFiles from both C++ and PyROOT macros, as well as from interactive ROOT sessions. Opening an existing file produces an error like the following:

> [avroy@lxplus731 Limits]$ root -l /eos/user/a/avroy/Analysis_Outputs/ALL_OUTPUTS2//MYOUTPUT.sig.M13K090.root
> root [0] 
> Attaching file /eos/user/a/avroy/Analysis_Outputs/ALL_OUTPUTS2//MYOUTPUT.sig.M13K090.root as _file0...
> Error in <TFile::ReadBuffer>: error reading all requested bytes from file /eos/user/a/avroy/Analysis_Outputs/ALL_OUTPUTS2//MYOUTPUT.sig.M13K090.root, got 63 of 300
> Error in <TFile::Init>: /eos/user/a/avroy/Analysis_Outputs/ALL_OUTPUTS2//MYOUTPUT.sig.M13K090.root failed to read the file type data.
> (TFile *) nullptr
> root [1] .q

The worst part of this error is that it actually corrupts the file. Running ls -lh afterwards gives the following output:

> [avroy@lxplus731 Limits]$ ls -lh /eos/user/a/avroy/Analysis_Outputs/ALL_OUTPUTS2//MYOUTPUT.sig.M13K090.root
> -rw-r--r--. 1 avroy zp 63 Aug  7 16:43 /eos/user/a/avroy/Analysis_Outputs/ALL_OUTPUTS2//MYOUTPUT.sig.M13K090.root

As you can see, the file has shrunk to 63 bytes from its expected size of ~1 MB. Each of these files contains about 1k histograms and was created by merging 3 similar TFiles. All such files are stored in a single directory that holds every TFile needed for my analysis, ~65k files in total.

I am also seeing a similar, though possibly unrelated, error: ROOT fails to read certain files in a given lxplus session and returns null pointers, even though the file is there, has the expected size, and is perfectly readable from a different lxplus session.

It would be great to have suggestions on how to work around these issues.

ROOT Version: 6.18/04
Built for linuxx8664gcc on Sep 11 2019, 15:38:23
From tags/v6-18-04@v6-18-04

kind regards,
Avik

Hi Avik,

> The worst part of this error is that it actually corrupts the file.

When a file is opened in “READ” mode (the default, which you are using), the operating system should not allow the process to modify it (we go through the regular POSIX fopen and co).

This seems to indicate a problem or instability in the file system itself (i.e. EOS). Such an instability would explain all the symptoms you describe.

To verify this, you could copy the file to a local disk and try opening it there (after checking, for example with md5sum, that it was copied correctly).
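
Since you mention PyROOT, here is a rough sketch of that check in Python. The helper names (md5sum, copy_and_verify, open_read_only) are made up for illustration, and the sketch assumes the EOS file is still intact when it is copied. ROOT is imported only inside open_read_only, so the copy and checksum step works without PyROOT.

```python
import hashlib
import shutil
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, local_dir):
    """Copy source into local_dir and check that both copies have the same MD5."""
    source = Path(source)
    target = Path(local_dir) / source.name
    shutil.copyfile(source, target)
    if md5sum(source) != md5sum(target):
        raise IOError(f"checksum mismatch after copying {source}")
    return target


def open_read_only(path):
    """Try to open a file with ROOT in READ mode; return True if it is usable."""
    import ROOT  # imported lazily so the copy check does not need PyROOT

    root_file = ROOT.TFile.Open(str(path), "READ")
    if not root_file or root_file.IsZombie():
        return False
    root_file.Close()
    return True
```

You would call copy_and_verify with the EOS path and a local directory such as /tmp, then pass the returned path to open_read_only. If the local copy opens reliably while the EOS copy does not, that points at the file system rather than at ROOT.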

Cheers,
Philippe.
