Random errors in TFile Read/Write

Hi Guys,

I’m seeing strange errors when reading/writing TTrees…

I work on lxplus interactively, cloning TTrees from TFiles, filtering and adding branches, then saving to other TFiles…

I sometimes get the following errors:

  1. Error in TFile::ReadBuffer: error reading all requested bytes from file , got 2439 of 5769 (When trying to read TREX final files)

  2. SysError in TFile::WriteBuffer: error writing to file <file_name> (-1) (Remote I/O error)
    Error in TBranchElement::TBranch::WriteBasketImpl: basket’s WriteBuffer failed.

Error in TBranchElement::TBranch::Fill: Failed to write out basket.

Error in TBranchElement::Fill: Failed filling branch:tau_0_charged_tracks_z0_sintheta_tjva, nbytes=-1
Error in TTree::Fill: Failed filling branch:NOMINAL.tau_0_charged_tracks_z0_sintheta_tjva, nbytes=-1, entry=2209356
This error is symptomatic of a Tree created as a memory-resident Tree
Instead of doing:
TTree *T = new TTree(…)
TFile *f = new TFile(…)
you should do:
TFile *f = new TFile(…)
TTree *T = new TTree(…)
(and repeating for many branches)

  3. SysError in TFile::Flush: error flushing file

I run on many files, and these errors seem random: always on a different file, with a different error. Usually just one file/tree in a run exhibits an error (out of ~200 files), while the others are processed fine…

When I re-run the same script on the problematic file, there is no error and everything runs smoothly…
I have two different programs processing those files, and the problem exists in both.

All files are located on EOS. The input files are used by many people, so there shouldn’t be a problem there. The output files are recreated each time I run.

What is happening? I’m guessing something is unstable, but I’m not sure what… Is it the files? Why?

I would much appreciate your help!

My code is as follows:

ifstream fs;
fs.open(args.infilelist);
TString infile;
// loop over the input TFile paths (read from the list file):
while(fs >> infile){

   // this function opens the input file, reads the TTree and returns a pointer to it
   TTree *tree = initialize_tree(infile, h_metadata, h_metadata_theory_weights, args.ismc);
   // then the output file is created:
   TFile f(file_name,"recreate");
   auto t1 = tree->CloneTree(0);
   t1->Branch("isVBF",&isVBF);
   // <and adding more branches>

   Long64_t nEvents = tree->GetEntries();
   for (Long64_t iEvent = 0; iEvent < nEvents; iEvent++) {
      tree->GetEntry(tree->GetEntryNumber(iEvent));
      // <filtering and calculating branch variables>
      t1->Fill();
   }

   t1->Write();
   f.Close();
}

fs.clear();
fs.seekg(0,ios::beg);
fs.close();

ROOT Version: 6.18/04
Platform: lxplus

Hi @Galsel ,
sorry for the late reply, things are a bit slow due to the holidays.
I think we need @pcanal 's help here, let’s ping him.

Cheers,
Enrico

In your snippet the lifetime of tree and its associated file is unclear (if I understood correctly, the input file ought to be closed only after the for loop). It is also unclear whether filename is updated at each iteration of the while loop (if not, the same output file is overwritten many times).
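
For the second point, a minimal sketch of one way to make the per-iteration output name explicit. Everything here is an assumption for illustration: the helper names make_output_name and register_output, the "_filtered" suffix and the output directory are hypothetical and not taken from the original code. It uses only the C++17 standard library, not ROOT.

```cpp
#include <cassert>
#include <filesystem>
#include <set>
#include <string>

namespace stdfs = std::filesystem;

// Hypothetical helper: derive the output path from the input path, so that
// every input file in the while loop gets its own output file.
// Example: "/in/a.root" with out_dir "out" gives "out/a_filtered.root".
std::string make_output_name(const std::string& input, const std::string& out_dir) {
    assert(!input.empty());
    const stdfs::path in(input);
    const stdfs::path out =
        stdfs::path(out_dir) / (in.stem().string() + "_filtered" + in.extension().string());
    return out.generic_string();
}

// Hypothetical guard: returns false if this output name was already used in
// the current run, i.e. a later "recreate" would overwrite an earlier output.
bool register_output(std::set<std::string>& seen, const std::string& name) {
    return seen.insert(name).second;
}
```

Inside the while loop one would compute the name from infile, check register_output, and only then open the output file with "recreate". Note that two inputs with the same base name in different directories would map to the same output here, which is exactly the case the guard is meant to catch.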

If the lifetime of the input and the filename of the output are both correct, then the next likely cause is instability of the EOS connection. To test this hypothesis, copy the input files locally, run on those copies, and write the output locally.

If you still see the problems then, try running the failing example under valgrind (valgrind --suppressions=$ROOTSYS/etc/valgrind-root.supp your_command your_arguments); the output may contain some clues about what is going on.

Cheers,
Philippe.

Thank you guys!

The filename is updated, but I now notice that the input file is not closed at all… though I’m not sure whether that can affect the output file…
In any case, I’ll correct this and try the other suggestions if that doesn’t work. Thanks a lot!

Best,
Gal
