File size increase when adding a new branch to an existing tree

Hi,
I am trying to add a new branch with about 26 leaves to an existing tree in a file. The new branch is defined by a struct. I closely followed the steps described in the user guide to add the branch. However, after adding the new branch, the file size increases considerably: my original file is about 6.3 MB, and after adding the branch it becomes 11 MB. The original file already contains many more branches than the one I added, so I cannot understand why fewer branches would occupy more space. I create the branch like this:

  OptVars myopt; //OptVars is my struct variable
  TBranch * newbr = tree->Branch("OptVars", &myopt.OrgNuPz_S1, "OrgNuPz_S1/D:OrgNuPz_S2:....");

I guess I need to specify the compression factor?

Cheers,
Zhiyi.

Could you post:
- the definition of your structure OptVars
- the result of tree->Print()

Rene

The struct:

  struct OptVars {
    double OrgNuPz_S1;
    double OrgNuPz_S2;
    double OptJet1NuPz_S1;
    double OptJet1NuPz_S2;
    double OptJet1TopMass_S1;
    double OptJet1TopMass_S2;
    double OptJet1TM_Posterior_S1;
    double OptJet1TM_Posterior_S2;
    double OptJet2NuPz_S1;
    double OptJet2NuPz_S2;
    double OptJet2TopMass_S1;
    double OptJet2TopMass_S2;
    double OptJet2TM_Posterior_S1;
    double OptJet2TM_Posterior_S2;
    double OptJet3NuPz_S1;
    double OptJet3NuPz_S2;
    double OptJet3TopMass_S1;
    double OptJet3TopMass_S2;
    double OptJet3TM_Posterior_S1;
    double OptJet3TM_Posterior_S2;
    double OptJet4NuPz_S1;
    double OptJet4NuPz_S2;
    double OptJet4TopMass_S1;
    double OptJet4TopMass_S2;
    double OptJet4TM_Posterior_S1;
    double OptJet4TM_Posterior_S2;
  };

diff_print.txt (1.2 KB)
tree_print_small.txt (7.95 KB)

Your new branch OptVars has only 20 entries where all other branches have 3776 entries. Your new branch occupies only 7978 bytes in memory with no basket written to the file so far.
I have the impression that you have called tree.Fill instead of branch.Fill to fill the new branch.

Rene
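To illustrate the difference Rene is pointing at, here is a toy model in plain C++. It is only a sketch: `ToyTree`, `FillBranch` and `FillTree` are made-up names standing in for a tree, branch.Fill and tree.Fill, and they are not the ROOT classes. The only behaviour assumed is that filling the tree adds one entry to every branch, while filling a branch adds one entry to that branch alone.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a tree (hypothetical, NOT the ROOT TTree class).
// entries[i] is the number of entries stored in branch i.
struct ToyTree {
    std::vector<std::size_t> entries;

    // Add an empty branch and return its index.
    std::size_t AddBranch() {
        entries.push_back(0);
        return entries.size() - 1;
    }

    // Analogue of branch.Fill: adds one entry to a single branch.
    void FillBranch(std::size_t i) {
        assert(i < entries.size());
        ++entries[i];
    }

    // Analogue of tree.Fill: adds one entry to every branch.
    void FillTree() {
        for (std::size_t& n : entries) ++n;
    }
};

// Build a toy tree with nBranches branches, each holding nEntries entries.
ToyTree MakeFilledTree(std::size_t nBranches, std::size_t nEntries) {
    ToyTree t;
    for (std::size_t b = 0; b < nBranches; ++b) t.AddBranch();
    for (std::size_t e = 0; e < nEntries; ++e) t.FillTree();
    return t;
}
```

In this model, adding a branch and calling `FillBranch` once per existing entry leaves every branch with the same entry count. Calling `FillTree` instead also appends to the branches that were already complete, so the old branches grow while the new one stays short.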

Calculating the values is very slow, so I only filled the first 20 entries. I thought that in principle this should behave the same as filling all entries. Now I have filled the branch with all entries. Please see below. The file size is 11 MB while the original file size is 6.3 MB. The file with 20 entries has the same size as the one with all entries filled.

This branch is only 68 Kbytes in the file. Are you sure that you do not fill other branches (or worse, the full tree again)? See my previous message.

Rene

Okay, I upload my code here for reference. By the way, the root version I am using is: v4_04_02.
AddOptVarsToSingleTopTopoTree.cpp (15.1 KB)

In your code you keep opening the file in update mode, and on each run you save a new copy of the Tree header (tree.Write).
Your tree has only a few entries (3776) and 556 branches. All the data for each branch fits in memory, since your branch basket size is 32 KBytes.
If you look at the top of the output of TTree::Print, you will see that each header, with all the branches in memory, is about 15 MBytes of uncompressed data and about 4.2 MBytes of compressed data.
You have probably run your program in update mode a few times, which explains why the file size grew up to 63 Mbytes: the file holds several copies of the Tree header. You can check this by calling TFile::ls.

Rene
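As a rough cross-check of the numbers in this thread, the arithmetic can be sketched in a few lines of C++. This is only a back-of-the-envelope model, not anything ROOT computes: it assumes each extra saved copy of the Tree header adds about the compressed header size quoted above (4.2 MB) to the file and ignores everything else. The function names are made up for illustration.

```cpp
#include <cmath>

// Estimated file size in MB if `extraCopies` additional compressed copies of
// the Tree header are stored on top of the original file (assumed model).
double EstimatedFileSizeMB(double originalMB, double headerCopyMB, int extraCopies) {
    return originalMB + headerCopyMB * extraCopies;
}

// Inverse of the model above: how many extra header copies would explain the
// observed growth, rounded to the nearest whole copy.
long EstimatedExtraCopies(double originalMB, double observedMB, double headerCopyMB) {
    return std::lround((observedMB - originalMB) / headerCopyMB);
}
```

With the values reported here, `EstimatedFileSizeMB(6.3, 4.2, 1)` gives about 10.5 MB, close to the observed 11 MB, so a single extra header copy would already be enough to account for the growth under this assumption.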

Sorry, Rene, I still cannot understand what is going on here.

  1. My original file size is 6.3 MB. It becomes 11 MB after adding my new branch.
  2. I browsed the ROOT documentation web pages and found a method called TFile::Is() in the TFile class. However, this method is not available at my ROOT command line.

It is not there even when I use the newer version of ROOT (v5.16). Is the method TFile::Is() only available in ROOT versions newer than v5.16?

  3. I followed the steps to add a branch to an existing tree in the ROOT user guide:

  void tree3AddBranch() {
    TFile f("tree3.root", "update");
    Float_t new_v;
    // f is an object, not a pointer, so use f.Get rather than f->Get
    TTree *t3 = (TTree*)f.Get("t3");
    TBranch *newBranch = t3->Branch("new_v", &new_v, "new_v/F");
    // read the number of entries in t3
    Int_t nentries = (Int_t)t3->GetEntries();
    for (Int_t i = 0; i < nentries; i++) {
      new_v = gRandom->Gaus(0, 1);
      newBranch->Fill(); // fill only the new branch, not the whole tree
    }
    t3->Write("", TObject::kOverwrite); // save only the new version of the tree
  }

I cannot see any difference in principle, except that I am using a struct to add the branch. Is the problem caused by the struct?

  4. As you pointed out, the uncompressed size is much larger than the compressed one, so I still think I should compress the branch. But I don’t know how.

Thanks,
Zhiyi.

Hi Rene, to keep things simple, I used the example code I posted above to add just ONE branch to my existing tree. Interestingly, adding this single branch also increased my file size considerably, from 6.3 MB to 11 MB. I really cannot understand what causes this increase in file size.

The test above was done with the old version of ROOT, v4.04. If I use the newer ROOT v5.16 with the same example code and the same original file, the increase largely disappears: the modified file is almost the same size as the original. So the cause must be one or more bugs in the old ROOT version. In my case, however, the D0 framework is still based on ROOT v4.04 and I have to code within that framework, so I have no idea how to handle this issue in the old ROOT.

Read carefully my mail above. Repeat the operation without adding your branch, but still saving the Tree header to the file.

Rene

Yes, I repeated the operation without adding any branch. The file size still goes up. Since this is because I may write the tree header to the file several times, I guess the size increase does not scale with the number of entries. If that is the case, I won’t worry about it too much, since it is only a 5 MB difference no matter how many events are in the file. With the new version of ROOT, though, I don’t seem to have this problem.

Thank you for all your information.

If you do not have many entries in your Tree, you can reduce the branch basket size (e.g. 4000 instead of 32000) and your Tree header will be 8 times smaller.

Rene
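The factor of 8 can be checked with a small estimate. The sketch below assumes that the in-memory part of the Tree header is dominated by at most one basket buffer per branch, so its size is bounded by the number of branches times the basket size. This is a simplification, and the function name is hypothetical. The basket size is the optional fourth argument (`bufsize`, default 32000) of the `TTree::Branch(name, address, leaflist, bufsize)` call used earlier in this thread.

```cpp
#include <cstdint>

// Rough upper bound, in bytes, on the basket buffers kept with the Tree
// header, assuming one buffer of basketSizeBytes per branch (assumed model).
std::int64_t HeaderBufferBoundBytes(std::int64_t nBranches,
                                    std::int64_t basketSizeBytes) {
    return nBranches * basketSizeBytes;
}
```

For the 556 branches mentioned above, a 32000-byte basket gives a bound of 17,792,000 bytes, which is compatible with the roughly 15 MBytes of uncompressed header data quoted earlier. A 4000-byte basket gives 2,224,000 bytes, which is 8 times smaller.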