Read a TFile while another process is writing it

Hi,

I would like to use the new THttpServer for online monitoring of a detector setup. I wonder whether it is safe to open a TFile with the “READ” option in another process while the file is still being written by a DAQ process.

I could imagine some crash scenarios if the reading process accesses an event while it is being written.
I accidentally opened some TFiles during the write and nothing bad happened, but I do not want to access them continuously if there are cases that could corrupt the data file.

Thomas

[superseded by the complete comment of Rene!]

Just to contradict Danilo, it is safe to open a ROOT file being written by another process (on the same or another machine).
If the writing process(es) delete some information, the corresponding space is not immediately reused by the writing process. The directory info is only rewritten in the file header when the file is closed or flushed.
You have a tiny probability of accessing the file header in the reading process at the same time as it is being written, but even in this case the Recover function is called to make a new pass.
Of course, in the case where you write a TTree, you will not get the latest fills, but you will get safe information up to the previous TTree::AutoSave.
I would say that in the case of a THttpServer, you should be able to access the remote file without any problems. There are zillions of people doing that in online or monitoring systems.
I am doing precisely that (in my new job) thousands of times per day :)

Rene

How does one update the TFile in the reading process without closing and reopening it? From the description, TFile::Flush sounds like the solution, but it does not update the listing with the new objects added to the file.

First, it seems that the only way to update a TFile is to call TFile::Open again; ReOpen does not do anything if the mode has not changed. If there is a better way, I would love to know.

Second, I have the following example script which adds 100 counts to a histogram then sleeps for 1 second and repeats.

#include "TFile.h"
#include "TH1F.h"
#include <cstdio>     // printf, fflush
#include <unistd.h>   // sleep

void writer() {
   TFile *f = new TFile("test.root","RECREATE");
   TH1F *h = new TH1F("h","h",100,-2,2);
   f->Write(0,TObject::kOverwrite);
   int count = 0;
   while(1) {
      printf("Loop %d\r",count++);
      fflush(stdout);
      h->FillRandom("gaus",100);
      f->Write(0,TObject::kOverwrite);
      f->Flush();
      sleep(1);
   }
}

If I then try to open test.root in another process while the writer is still working and draw the histogram contents I get the following errors:

$ root test.root 
root [0] 
Attaching file test.root as _file0...
root [1] h->Draw()
R__unzip: error -5 in inflate (zlib)
R__unzip: error -5 in inflate (zlib)
Error: Symbol h is not defined in current scope  (tmpfile):1:
Error: Failed to evaluate h->Draw()
*** Interpreter error recovered ***

And

root [2] TFile *_file0 = TFile::Open("test.root")
root [3] h->Draw()
R__unzip: error -3 in inflate (zlib)
R__unzip: error -3 in inflate (zlib)
Error: Symbol h is not defined in current scope  (tmpfile):1:
Error: Failed to evaluate h->Draw()
*** Interpreter error recovered ***

Does this mean that compression cannot be used in this situation? Any improvements to the above example are appreciated.

Hi,

The use of the non-default TObject::kOverwrite is fatal in this case. TObject::kOverwrite is an explicit request to first remove the previous copy and then write the new copy (likely in place of the old one) into the file. This significantly increases the risk that the writing happens while the reader is not yet done reading.

If you want to write the object safely while still disabling the keeping of cycles (backup copies), use TObject::kWriteDelete (which writes the new copy, then deletes the old one).

You can also use SaveSelf rather than Flush, but then you need to explicitly store each histogram.

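#include "TFile.h"
#include "TH1F.h"
#include <cstdio>
#include <unistd.h>

void writer() {
   TFile *f = new TFile("test.root","RECREATE");
   TH1F *h = new TH1F("h","h",100,-2,2);
   f->Write(0,TObject::kWriteDelete);
   int count = 0;
   while(1) {
      printf("Loop %d\r",count++);
      fflush(stdout);
      h->FillRandom("gaus",100);
      // Write the new copy first, then delete the previous cycle.
      h->Write(0,TObject::kWriteDelete);
      // Update the file header and key list so that readers can see the new copy.
      f->SaveSelf();
      sleep(1);
   }
}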

To refresh the information on the reader side, use ReadKeys:

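void reading(TFile *f, const char *name="h") {
   f->ReadKeys();                 // re-read the key list from the file
   delete f->FindObject(name);    // drop the stale in-memory copy, if any
   TH1F *h = nullptr;
   f->GetObject(name,h);          // read the latest copy from the file
   if (h) h->Draw();
}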

Cheers,
Philippe.

Thanks, the SaveSelf and ReadKeys methods are new to me and look promising. However, it seems quite awkward that after calling ReadKeys one must delete the histogram object and then get it again.

At the moment the reading process simply runs in the interpreter, and I was able to update the histogram by simply calling TFile::Open again, as shown below. What is the downside to this over ReadKeys?

root [0] TFile *_file0 = TFile::Open("test.root")
root [1] h->Draw()
root [2] TFile *_file0 = TFile::Open("test.root")
root [3] h->Draw() //Draws updated histogram

ReadKeys reads only the keys' metadata, not the objects pointed to by the keys.
By calling ReadKeys you ensure that you get the latest valid representation of the keys and their pointers into the file.

Rene

What is the downside to this over ReadKeys?

Recreating the TFile object does everything you need (except that you also need to delete the previous one), but it also does more than you really need. In other words, doing multiple TFile::Open calls is simpler but a bit slower.

Cheers,
Philippe.