Read histogram from TFile while actively writing to it (with example)

Context

I have software that processes data from a data file. The software pre-defines histograms, and users can define their own trees and histograms to write to disk. The software writes the data to disk asynchronously, which means that histograms are still being filled while data is being written. I have confirmed that this doesn’t cause any issues: all data is written to disk as expected.

Problem

Now the tricky part. I’ve been trying to create a histogram / tree viewer to go along with the software. I’ve followed the advice listed here, but I keep ending up with:

  • Error R__unzip_header: error in header. Values :
  • Segmentation faults
  • R__unzip: error -X in inflate (zlib) (where X is either 3 or 5)

The remark that “zillions” of people are doing this suggests there’s a solution, but I’m at a loss to find any concrete examples of how this is handled.

Script

#include <TFile.h>
#include <TH1.h>

#include <iostream>

#include <unistd.h> // usleep

void viewOnline(TFile *file, const char *name) {
    TH1I *hist1 = nullptr;

    while(!hist1) {
        file->ReadKeys();
        delete file->FindObject(name);
        file->GetObject(name, hist1);
        usleep(100000);
    }
    hist1->Draw();
}

Compiled using .L viewOnline.C+ in ROOT and called via viewOnline(_file0, "histogram")

Environment

Component          Value
Operating system   Ubuntu
Kernel version     4.4.0-43-Microsoft
CMake version      3.5.1
GCC version        gcc (Ubuntu 5.4.0-6ubuntu1~16.04.6) 5.4.0 20160609
ROOT version       6.13/01
GSL version        2.4

Hi,

Are you modifying the same file from two different processes ("delete file->FindObject(name);")?

Cheers,
D

That is correct. I am using the method suggested in this forum post. I had to remove the const modifier from the TFile argument in reading because ReadKeys is not marked as const (see error below).

I tried omitting delete file->FindObject(name);, but I end up with this error:

Error in <TFile::ReadKeys>: reading illegal key, exiting after 1 keys

If I use the method described in this response, then I can view histograms as expected.

I should note that I’ve also read the threads here and here.

Error when using const TFile

In file included from input_line_12:9:
././viewOnline.C:11:9: error: member function 'ReadKeys' not viable: 'this' argument has type 'const TFile', but function is not marked const
        file->ReadKeys();
        ^~~~
/opt/root/v6-11-02-1447-g38586de/include/TDirectoryFile.h:105:24: note: 'ReadKeys' declared here
   virtual Int_t       ReadKeys(Bool_t forceRead=kTRUE);
                       ^
In file included from input_line_12:9:
././viewOnline.C:13:15: error: no matching member function for call to 'GetObject'
        file->GetObject(name, hist1);
        ~~~~~~^~~~~~~~~
/opt/root/v6-11-02-1447-g38586de/include/TDirectoryFile.h:78:35: note: candidate function not viable: no known conversion from 'const TFile' to 'TDirectoryFile' for object argument
   template <class T> inline void GetObject(const char* namecycle, T*& ptr) // See TDirectory::Get for i...

I modified my script to call TFile::Open(), instead of taking a pointer to a TFile. It’s a modification from this post. According to Philippe, this is not the most efficient way to do this, see here.

Thus far this is the most robust method I’ve found.

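#include <TFile.h>
#include <TH1.h>

#include <iostream>
#include <string>

void viewOnline(const std::string &filename, const int &id) {
    // Keep the name in a std::string: c_str() on a temporary would dangle.
    const std::string name = "h" + std::to_string(id);
    TFile *file = TFile::Open(filename.c_str());
    if (!file || file->IsZombie()) {
        std::cerr << "Could not open " << filename << std::endl;
        return;
    }
    TH1D *hist1 = nullptr;
    file->GetObject(name.c_str(), hist1);
    // The histogram may not have been written yet; only draw if it was found.
    if (hist1)
        hist1->Draw();
}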
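
A note on building the histogram name: the pointer returned by c_str() is only valid while the std::string it came from is alive, so the name is safest kept in a named std::string for as long as it is needed. A minimal sketch of that idea follows. makeHistName is a hypothetical helper of mine, not part of ROOT, and the "h" + id naming is simply what my script uses.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the histogram name used in the script ("h" + id).
// Returning std::string by value keeps the characters alive for as long as
// the caller holds on to the result.
std::string makeHistName(int id) {
    assert(id >= 0);  // assumption of this sketch: IDs are non-negative
    return "h" + std::to_string(id);
}
```

The caller would then store the result, e.g. const std::string name = makeHistName(id);, and pass name.c_str() to GetObject while name is still in scope.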

Hi,

modifying the same file from 2 different processes is not allowed. Reading a file which is being written is supported (it is there to support real time acquisition systems for example) but at the cost of warnings which may be prompted by the system.

Cheers,
D

Then I’m confused by the suggestion made here. Why was this even suggested as a solution if that process isn’t allowed? What would be the correct way to handle this?

Is there an example of this? I’ve never gotten it to work well enough. Is there an alternative output that is better for real time acquisition?

Hi,

What is supported (and, as far as I can tell, described in the link you used) is writing by a single process into a file and reading by multiple processes. (For the readers to properly refresh their view of the updated file, they need to call ReadKeys.)

Cheers,
Philippe

Just to be clear: I’m only writing to the file via a single process, see here. The write command is protected by a mutex.

I am reading from the file by

  1. Opening it in ROOT: root file.root
  2. Loading the script : .L viewOnline.C+;
  3. Attempting to view the histogram: viewOnline(_file0, 1);

If I understand your suggestion properly, then if I simply remove

        delete file->FindObject(name);

from my original script, then it should work? I can confirm that removing this line still results in a segfault when trying to call

        file->GetObject(name, hist1);

The crash is a bit surprising. But nonetheless, I would actually expect the code to be like:

        delete file->FindObject(name);
        file->ReadKeys();

If this still fails, I would run the failing process under valgrind (with the option --suppressions=$ROOTSYS/etc/valgrind-root.supp) to get more information on the cause of the crash.

Cheers,
Philippe.

I will check this order when I return home. Do you know of any open source examples of reading online files? I have a feeling I’m trying to reinvent the wheel here.

@pcanal : I determined the cause of the segfault was calling Draw() on the TH1I pointer that was still NULL.

I realize now my issue with

        delete file->FindObject(name);
        file->ReadKeys();

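These commands only work if you have loaded the ROOT file that is actively being written. They do not work if you have opened an old file of the same name. When the file is RECREATEd, the inode and location of the physical file change, and a TFile object that is still looking at the old version is not informed that there is a new physical file it should be looking at.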

I now have two working methods of reading a file from disk while a process is actively writing to it. Below I post both the example, and steps to run. I will mark the topic as solved.

Working Example

///@file example-28273.C
///@brief A program that will generate data and asynchronously write it to disk.
///@author S. V. Paulauskas
///@date March 12, 2018
///@copyright Copyright (c) 2018 S. V. Paulauskas.
///@copyright All rights reserved. Released under the Creative Commons Attribution-ShareAlike 4.0 International License
#include <TFile.h>
#include <TH1D.h>

#include <iostream>
#include <mutex>
#include <thread>

#include <unistd.h> // sleep

void AsyncFlush(TFile *file, std::mutex *lock, unsigned int *loopId) {
    std::cout << "AsyncFlush - Now calling file->Write" << std::endl;
    file->Write(nullptr, TObject::kWriteDelete);
    std::cout << "AsyncFlush - Now unlocking the file for writing in loop " << *loopId << std::endl;
    lock->unlock();
}

void WriteToDisk(TFile *file, unsigned int *loopID, std::mutex *lock) {
    if (lock->try_lock()) {
        std::cout << "WriteToDisk - we're creating the new thread in loop number " << *loopID << std::endl;
        std::thread worker0(AsyncFlush, file, lock, loopID);
        worker0.detach();
    }
}

void GenerateHistogram() {
    std::mutex lock;
    TFile *f = new TFile("test.root", "RECREATE");
    TH1D *hist = new TH1D("hist", "", 100, -2, 2);
    unsigned int loopCounter = 0;
    while (loopCounter < 40000) {
        hist->FillRandom("gaus", 10000);
        WriteToDisk(f, &loopCounter, &lock);
        loopCounter++;
    }
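    // Wait for any AsyncFlush still in flight to release the lock.
    while(!lock.try_lock())
        sleep(1);
    f->Write(nullptr, TObject::kWriteDelete);
    f->Close();
    // Release the lock before it goes out of scope.
    lock.unlock();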
}

// Adapted from
// https://root-forum.cern.ch/t/read-a-tfile-while-writing-with-another-process/18608/6
// and follow up comment
// https://root-forum.cern.ch/t/read-histogram-tree-from-tfile-while-actively-writing-to-it/28273/10
void ReadHistogram(TFile *file, const char *name = "hist") {
    TH1D *hist1 = nullptr;
    delete file->FindObject(name);
    file->ReadKeys();
    file->GetObject(name, hist1);
    if(hist1)
       hist1->Draw();
}

// Adapted from
// https://root-forum.cern.ch/t/read-a-tfile-while-writing-with-another-process/18608/7
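void ReadHistogram() {
    // Re-opening the file on every call gives a fresh view of its contents.
    TFile *file = TFile::Open("test.root");
    if (!file || file->IsZombie())
        return;
    TH1D *hist = nullptr;
    file->GetObject("hist", hist);
    if(hist)
        hist->Draw();
}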
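
As a side note, the try_lock / detach bookkeeping in the writer could also be expressed with std::future, so the final wait does not need to poll a mutex. This is only a sketch in plain C++ without ROOT. AsyncFlusher is a hypothetical class, and the work passed in stands in for the file->Write call. It only covers the "one flush at a time" bookkeeping; whether it is safe to fill a histogram while another thread writes it is a separate question that this does not address.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>

// Hypothetical helper: runs at most one flush at a time on a background
// thread and lets the caller wait for it without polling a mutex.
class AsyncFlusher {
public:
    // Starts `work` in the background unless a previous flush is still
    // running. Returns true if a new flush was started.
    bool tryFlush(const std::function<void()> &work) {
        assert(work != nullptr);
        if (isBusy())
            return false;
        pending_ = std::async(std::launch::async, work);
        return true;
    }

    // Blocks until the flush in flight (if any) has finished.
    void wait() {
        if (pending_.valid())
            pending_.get();
    }

private:
    // True while a previously started flush has not finished yet.
    bool isBusy() const {
        return pending_.valid() &&
               pending_.wait_for(std::chrono::seconds(0)) != std::future_status::ready;
    }

    std::future<void> pending_;
};
```

In the writer loop, tryFlush would take the place of WriteToDisk, and a single wait() before the final Write and Close would replace the try_lock / sleep loop.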

Instructions for Running Script

Terminal A - The Writer

user@localhost:> root
root [0] .L example.C+;
root [1] GenerateHistogram();

Terminal B - The Reader

user@localhost:> root
root [0] .L example_C.so;
root [1] ReadHistogram();
root [2] ReadHistogram();
root [3] ReadHistogram();
...

Terminal C - The Reader (alternate)

Executed after Writing has started in Terminal A

user@localhost:> root test.root
root [0] .L example_C.so;
root [1] ReadHistogram(_file0);
root [2] ReadHistogram(_file0);
root [3] ReadHistogram(_file0);
...
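
Instead of repeating the call by hand, the reader could poll with a bounded number of attempts and only draw once something was actually found, which also avoids the null-pointer Draw() that caused my segfault. Below is a minimal sketch in plain C++. pollUntilAvailable is a hypothetical helper, and the fetch callable is assumed to wrap whatever refresh-and-read sequence you use (for example the FindObject / ReadKeys / GetObject steps from ReadHistogram).

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: calls fetch() until it returns a non-null pointer or
// maxAttempts is exhausted, sleeping for `delay` between attempts.
// Returns nullptr if nothing was found, so the caller can skip drawing.
template <typename Fetch>
auto pollUntilAvailable(Fetch fetch, int maxAttempts,
                        std::chrono::milliseconds delay) -> decltype(fetch()) {
    assert(maxAttempts > 0);
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        if (auto object = fetch())
            return object;
        // No need to sleep after the last attempt.
        if (attempt + 1 < maxAttempts)
            std::this_thread::sleep_for(delay);
    }
    return nullptr;
}
```

The attempt count and delay are arbitrary choices here; the point is only that the loop always terminates and that the result must be checked before calling Draw().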

I am not sure what you mean here … Are those operations in the same process?

@pcanal : They say pictures are worth 1k words. Here’s a drawing of what I mean.

When the file is RECREATEd, I assume the memory address changes.

Right, I get it now. [I think you meant: “I assumed that the inode and location of the physical file change, and the TFile objects that are looking at the old version are not informed that there is a ‘new physical file’ they should be looking at.”]

Cheers,
Philippe.

PS. What confused me is that memory address usually refers to a location in the RAM (of a given process) rather than a location on the physical disk.
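
For anyone who wants to see the inode effect outside of ROOT, here is a small POSIX-only sketch. It assumes that re-creating a file amounts to removing it and creating a new one under the same name, which is how I read the discussion above; I have not checked what RECREATE does internally. recreateWhileOpen and RecreateResult are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <fcntl.h>
#include <string>
#include <sys/stat.h>
#include <unistd.h>

// Inode numbers seen around a remove-and-create of the same path.
struct RecreateResult {
    ino_t openedBefore;  // inode behind a descriptor opened before the recreate
    ino_t pathAfter;     // inode the same path points to after the recreate
};

// Hypothetical helper: mimics a "recreate" with plain POSIX calls while a
// reader keeps the old file open. The file at `path` is removed at the end.
RecreateResult recreateWhileOpen(const std::string &path) {
    assert(!path.empty());

    // The "old" file, as left behind by a previous run.
    int writer = open(path.c_str(), O_CREAT | O_TRUNC | O_WRONLY, 0644);
    assert(writer >= 0);
    close(writer);

    // A reader opens the old file and keeps it open.
    int reader = open(path.c_str(), O_RDONLY);
    assert(reader >= 0);

    // The writer starts over: remove the file and create it again.
    unlink(path.c_str());
    writer = open(path.c_str(), O_CREAT | O_EXCL | O_WRONLY, 0644);
    assert(writer >= 0);

    struct stat viaReader{};
    struct stat viaPath{};
    const bool statsOk = fstat(reader, &viaReader) == 0 &&
                         stat(path.c_str(), &viaPath) == 0;
    assert(statsOk);
    (void)statsOk;

    close(reader);
    close(writer);
    unlink(path.c_str());
    return {viaReader.st_ino, viaPath.st_ino};
}
```

The two inode numbers differ because the reader's descriptor keeps the old file alive, so the new file cannot reuse its inode. The reader keeps seeing the old contents until it opens the path again.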

@pcanal : Thanks for clarifying the language! I updated the post marked as the solution with your rephrasing.

I’m still learning a lot of the jargon surrounding memory / inodes / etc. I appreciate all your help!
