Modifying TTree in place changes file size

Dear ROOTers,

I am trying to modify an entry in a TTree in place and write it back to the same file (I know, I know, that’s not how it should be done in ROOT, but I have no choice in this case). The problem is that almost every time I modify the TTree, the file size grows. I know about the backup cycle, but I would not expect it to make the file grow every time the tree is modified.

Here is how I modify the TTree:

auto f = new TFile("f.root", "update");
auto t_old = (TTree*)f->Get("t");
t_old->SetName("t_old");
t_old->SetBranchStatus("a", 0);

int a=0, b=0;

auto t = t_old->CloneTree(0);
t->SetName("t");
t->Branch("a", &a, "a/I");

a = 2;

t->Fill();

delete t_old;

t->Write("", TObject::kWriteDelete);
f->Close();

I attach scripts for both creation of the event and modification of it.

How to reproduce:

  1. Run: root -l -q create_event.C
  2. Check the f.root size
  3. Run: root -l -q modify_event.C
  4. Check the f.root size
  5. Keep running modify_event.C and checking sizes

Am I doing something wrong?

create_event.C (222 Bytes)
modify_event.C (312 Bytes)


ROOT Version: 6.30.09
Platform: Fedora 40
Compiler: Not Provided


Maybe @pcanal can take a look and give some hints

See below for the detailed answer, but in the end the essential question is “why” you need to modify the file in place instead of simply creating a new file and, if needed, renaming it to the old name. The answer to that question determines whether the cost (writing code that really matches the need behind the “why”) and the risk (possibly losing the original data if anything goes wrong in the update code) are worth it, or even necessary.

The code in the first post creates a new TTree with the same structure and no data except for one entry in the branch ‘a’ (the existing data for the branch ‘a’ is not deleted), so the file grows by at least that additional data.

In addition:

t->Write("", TObject::kWriteDelete);

does “write object, then delete previous key with same name”. Since the new copy is written while the old one still occupies its place, the file also grows by the size of the TTree meta-data.

You *could* write code that finds all the baskets of the branch a in the file, releases their space (adding it to the list of free blocks), removes the branch a from the tree's list of branches (or maybe clears it; that may or may not work), adds a new branch a, stores the new data for that branch, and updates the meta-data. In the updated file, the space freed from the old baskets may or may not be large enough to hold the new data for the branch a, depending on the compression ratio and thus on the values. Likewise, the new meta-data may or may not fit in the space used by the old meta-data. (And I am most likely forgetting some devil-is-really-in-the-details issues.)
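
To make the “may or may not fit” point concrete, here is a toy model in plain C++. It is only a sketch under loud assumptions: `ToyFile`, `allocate`, `release` and `rewrite_record` are hypothetical names, the first-fit policy without merging of neighbouring gaps is a simplification, and none of this is ROOT's actual free-block handling. It only illustrates why a “write new, then delete old” update grows a file on some rewrites and not on others.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of a file with a list of freed gaps. NOT ROOT's allocator.
struct ToyFile {
    struct Gap { std::int64_t first; std::int64_t size; };

    std::int64_t end = 0;    // current file size in bytes
    std::vector<Gap> gaps;   // freed regions inside the file (never merged here)

    // Place a record of nbytes: reuse the first gap that is large enough,
    // otherwise append at the end, which grows the file. Returns the offset.
    std::int64_t allocate(std::int64_t nbytes) {
        assert(nbytes > 0);
        for (auto it = gaps.begin(); it != gaps.end(); ++it) {
            if (it->size >= nbytes) {
                const std::int64_t pos = it->first;
                it->first += nbytes;
                it->size -= nbytes;
                if (it->size == 0) gaps.erase(it);
                return pos;
            }
        }
        const std::int64_t pos = end;
        end += nbytes;
        return pos;
    }

    // Mark a previously written record as free space.
    void release(std::int64_t first, std::int64_t nbytes) {
        gaps.push_back({first, nbytes});
    }
};

// "Write new, then delete old": the new record cannot reuse the old
// record's space, because that space is still occupied while writing.
std::int64_t rewrite_record(ToyFile& f, std::int64_t old_pos,
                            std::int64_t old_size, std::int64_t new_size) {
    const std::int64_t new_pos = f.allocate(new_size);
    f.release(old_pos, old_size);
    return new_pos;
}
```

In this model, a rewrite only avoids growth when some gap left by an earlier rewrite happens to be large enough for the new record, which is consistent with a file that grows on most updates but not on all of them.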

It is much simpler to just do:

auto f_old = new TFile("f.root", "read");
auto t_old = f_old->Get<TTree>("t");
t_old->SetBranchStatus("a", 0); // do not read or copy the old branch a
auto f = new TFile("f-new.root", "RECREATE");
auto t = t_old->CloneTree(0); // copy the structure only, no entries yet
int a = 0;
t->Branch("a", &a, "a/I");
for(Long64_t e = 0; e < t_old->GetEntries(); ++e) 
{
   t_old->GetEntry(e);
   a = 2; // assign the correct value for each entry
   t->Fill();
}
f->Write();
delete f_old;
delete f;
// If we want to look like we did it in-place (at the expense of losing data if there is a bug)
gSystem->Unlink("f.root");
gSystem->Rename("f-new.root", "f.root");

If write performance (i.e. the speed of this operation) is more important than read speed (using the file for plotting), we can also use this alternative:

auto f_old = new TFile("f.root", "read");
auto t_old = f_old->Get<TTree>("t");
t_old->SetBranchStatus("a", 0); // do not copy the old branch a
auto f = new TFile("f-new.root", "RECREATE");
auto t = t_old->CloneTree(-1, "fast"); // copy all entries of the active branches
int a = 0;
auto b = t->Branch("a", &a, "a/I");
for(Long64_t e = 0; e < t_old->GetEntries(); ++e) 
{
   a = 2; // assign the correct value for each entry
   b->BackFill(); // fill only the new branch for the already existing entries
}
f->Write();
delete f_old;
delete f;
// If we want to look like we did it in-place (at the expense of losing data if there is a bug)
gSystem->Unlink("f.root");
gSystem->Rename("f-new.root", "f.root");
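
As a side note on the final Unlink/Rename step: if losing data because of a bug is a concern, the swap can keep the old file around as a backup instead of deleting it. Below is a minimal sketch in plain C++17 (`std::filesystem`), not a ROOT facility. The function names and the `.bak` suffix are hypothetical, and the only sanity check is that the new file starts with the 4-byte `root` identifier that ROOT files begin with, which catches a missing or empty output but says nothing about whether the tree content is right.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Cheap sanity check: the file exists and starts with the bytes "root".
bool looks_like_root_file(const fs::path& p) {
    std::ifstream in(p, std::ios::binary);
    std::string magic(4, '\0');
    in.read(&magic[0], 4);
    return in.gcount() == 4 && magic == "root";
}

// Move `fresh` to `target`, keeping the previous `target` as `target` + ".bak".
// Returns false and leaves `target` in place if the check or a rename fails.
bool swap_in_new_file(const fs::path& fresh, const fs::path& target) {
    assert(fresh != target);
    if (!looks_like_root_file(fresh)) return false;

    std::error_code ec;
    fs::path backup = target;
    backup += ".bak";

    const bool had_old = fs::exists(target, ec);
    if (had_old) {
        fs::rename(target, backup, ec);   // keep the old data instead of unlinking
        if (ec) return false;
    }
    fs::rename(fresh, target, ec);
    if (ec) {
        if (had_old) {
            std::error_code ignore;
            fs::rename(backup, target, ignore);   // try to restore the old file
        }
        return false;
    }
    return true;
}
```

With the file names used above, this would be called as `swap_in_new_file("f-new.root", "f.root")` after both TFile objects have been deleted; the backup can be removed once the new file has been checked.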

Thanks! The idea of creating a new file and then renaming it to the old file’s name is what I was going to do, but first I wanted to check if perhaps I am doing something wrong.

While I understand that this would be of very low priority, perhaps the ROOT team could consider supporting a simple way to replace a TTree within a file? My bet is that I am not the only person who would benefit from this.
