Writing to file

Hi experts,
I am writing a tree with 53 branches to file; each branch is 32-bit (either int or float). The final tree has ~10 million events. When I write this to file it is extremely slow. I have not encountered this problem before, although I have written large trees to file, and I have 64 GB of RAM available.

If I lower the number of events, or lower the number of branches, then the write goes very quickly.

But my understanding of ROOT's memory usage is that a tree this size should be no problem. Am I correct? At what size should I start seeing real slow-downs when ROOT writes to file? Are there any obvious means of speeding up the write?

The only other possibility is that this might be due to the use of PROOF (I created the tree as a PROOF session output). But that’s another forum, so I want to cover the basics first.

Many thanks!
Mike

My code, in case it helps, is:

   TFile *f = TFile::Open(outFileName, "RECREATE");
   cout << endl << "--->opened the file" << endl;
   TTree *outTree = (TTree*)proof->GetOutputList()->FindObject("outTree");
   if (outTree) {
     cout << "--->successfully made the bush." << endl;
     cout << "--->it has # of entries = " << outTree->GetEntries() << endl;
     f->cd();
     outTree->Write();
     cout << endl << "wrote the tree" << endl;
   }

And the output is

--->opened the file
--->successfully made the bush.
--->it has # of entries = 6867428

But then it hangs forever unless I back off the number of entries or branches.

Which ROOT version are you using? On which machine?

Dear Mike,

Can you check the memory usage of the hanging process?
In any case, creating a TTree with PROOF is better done via a file: see root.cern.ch/handling-outputs .
If the TTree is the only object in the output, you may be interested in the “of=<outputfile.root>” option, which sets up everything automatically; see root.cern.ch/handling-outputs#outputfile .

G Ganis

Dear couet,
I should have listed that. My apologies. It is 5.34.28, on a CentOS 6.7 machine. But anyway it is clear that the issue is with PROOF outputs being handled differently. I am still climbing that learning curve.

Dear ganis,
Thanks for the help. I am looking into the methods you suggest. Trying to understand the examples.

Thanks again,
Mike

Dear ganis,
I made a solution using the TFile approach. It all works wonderfully, and writing the tree no longer hangs. The only thing that concerns me is that some of the time I get an error, even though the output is still produced. I get:

 +++ Starting PROOF-Lite with 20 workers +++
Opening connections to workers: OK (20 workers)                 
Setting up worker servers: OK (20 workers)                 
PROOF set to parallel mode (20 workers)
08:48:32 51959 Wrk-0.1 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:32 51965 Wrk-0.4 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:32 51979 Wrk-0.11 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:32 51967 Wrk-0.5 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:32 51981 Wrk-0.12 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:32 51973 Wrk-0.8 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:32 51969 Wrk-0.6 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:32 51995 Wrk-0.19 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:32 51977 Wrk-0.10 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:32 51985 Wrk-0.14 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:32 51993 Wrk-0.18 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:32 51961 Wrk-0.2 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:33 51971 Wrk-0.7 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:33 51991 Wrk-0.17 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:33 51957 Wrk-0.0 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:33 51989 Wrk-0.16 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:33 51963 Wrk-0.3 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:33 51983 Wrk-0.13 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:33 51987 Wrk-0.15 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
08:48:33 51975 Wrk-0.9 | Info in <TProofServLite::HandleCache>: loading macro selector.so ...
***class instance created*** 
 
Info in <TProofLite::SetQueryRunning>: starting query: 1
Info in <TProofQueryResult::SetRunning>: nwrks: 20
-->Client beginning...
Info in <selector::Begin>: args: mode: meas
Info in <selector::Begin>: args: channel: pi2e
Looking up for exact location of files: OK (5397 files)                 
Info in <TPacketizer::TPacketizer>: Initial number of workers: 20
Validating files: OK (5397 files)                 
[TProof:] Total 148612163 eventsworkers|====================| 100.00 % [304362.8 evts/s, 82.0 MB/s, time left: 0.0 s]]
 Query processing time: 488.3 s
Info in <TFile::GetStreamerInfoList>: cannot find the StreamerInfo record in file /nv/blue/mgv4ce/.proof/sfs-lustre-scratch-mgv4ce-selector/session-udc-ba35-37-1466772510-51859/worker-0.2//output.root
Info in <TFile::GetStreamerInfoList>: cannot find the StreamerInfo record in file /nv/blue/mgv4ce/.proof/sfs-lustre-scratch-mgv4ce-selector/session-udc-ba35-37-1466772510-51859/worker-0.19//output.root
Info in <TFile::GetStreamerInfoList>: cannot find the StreamerInfo record in file /nv/blue/mgv4ce/.proof/sfs-lustre-scratch-mgv4ce-selector/session-udc-ba35-37-1466772510-51859/worker-0.15//output.root
Info in <TFile::GetStreamerInfoList>: cannot find the StreamerInfo record in file /nv/blue/mgv4ce/.proof/sfs-lustre-scratch-mgv4ce-selector/session-udc-ba35-37-1466772510-51859/worker-0.12//output.root

Output file: output.root
-->Terminating... 
outputFile: output.root
Managed to open file: output.root

writing the tree with # entries = 6032041
Lite-0: all output objects have been merged 

As I said, the output is still produced, but the errors make me nervous since I don’t know whether they indicate a larger problem.

In SlaveBegin I do:

void selector::SlaveBegin(TTree * /*tree*/)
{
  UInt_t opt = TProofOutputFile::kRegister | TProofOutputFile::kOverwrite | TProofOutputFile::kVerify;
  TNamed *out = (TNamed *) fInput->FindObject("PROOF_OUTPUTFILE_LOCATION");
  Info("SlaveBegin", "PROOF_OUTPUTFILE_LOCATION: %s", (out ? out->GetTitle() : "undef"));
  fProofFile = new TProofOutputFile("output.root", (out ? out->GetTitle() : "M"));
  out = (TNamed *) fInput->FindObject("PROOF_OUTPUTFILE");
  if (out) fProofFile->SetOutputFileName(out->GetTitle());

  fFile = fProofFile->OpenFile("RECREATE");
  if (fFile && fFile->IsZombie()) SafeDelete(fFile);
  // Cannot continue
  if (!fFile) {
      Info("SlaveBegin", "could not create '%s': instance is invalid!", fProofFile->GetName());
      return;
  }

  outTree = new TTree("outTree","outTree");
  outTree->Branch("e_deg",           &e_deg);
  //add all the branches
  
  outTree->SetDirectory(fFile);
  outTree->AutoSave();
}

And in SlaveTerminate I do:

void selector::SlaveTerminate()
{
  if (!fFile) cout << endl << "The file was not detected by SlaveTerminate" << endl;
  if (fFile) {
    cout << "name of ffile is: " << fFile->GetName() << endl;
    if (!outTree) {
      Error("SlaveTerminate", "'outTree' is undefined!");
      return;
    }
    Bool_t cleanup = kFALSE;
    TDirectory *savedir = gDirectory;
    if (outTree->GetEntries() > 0) {
      fFile->cd();
      outTree->Write(0, TObject::kOverwrite);
      fProofFile->Print();
      fOutput->Add(fProofFile);
    }
  }
}

And in Terminate I do:

void selector::Terminate()
{
  cout << "-->Terminating... " << endl;

  if ((fProofFile =
         dynamic_cast<TProofOutputFile*>(fOutput->FindObject("output.root")))) {

    TString outputFile(fProofFile->GetOutputFileName());
    TString outputName(fProofFile->GetName());
    Printf("outputFile: %s", outputFile.Data());

    // Read the tree from the file
    fFile = TFile::Open(outputFile);
    if (fFile) {
      Printf("Managed to open file: %s", outputFile.Data());
      outTree = (TTree *) fFile->Get("outTree");
      cout << endl << "writing the tree with # entries = " << outTree->GetEntries() << endl;
    } else {
      Error("Terminate", "could not open file: %s", outputFile.Data());
    }
    if (!fFile) return;

  } else {
    Error("Terminate", "TProofOutputFile not found");
    return;
  }