Tree writing speed and size

vaubee · February 23, 2015, 11:35am

Okay, I’ve made a nice post explaining my problem, but the forum decided I took too long and had me log in again, which meant I lost all the text I wrote. So here is a (much terser) re-write of my previous attempt:

We’ve experienced some issues with writing root-trees. The write speed is about 10 MB/s where the raid system that we’re writing to supports about 500 MB/s write speed.
To test this I wrote a small program writing 100M events to a tree that has two branches, each a simple double that is filled using rand().
Using the normal scheme of opening a root-file, creating a tree, and then filling it, it took about 54 seconds and created a 504 MB root-file.
Creating the tree in memory, filling it, and then opening a root-file and writing the tree to it takes about 21 seconds and creates a 1.5 GB root-file.
Looking at the Print statement for these two trees we noticed that the basket sizes were 32 000 for the in-memory tree and 3 200 000 for the on-file tree.
So we set the basket size explicitly to 3 200 000 when creating the branches and re-ran the program. The speed and file sizes didn’t change, but the in-memory basket size was now 3 200 000. The on-file basket size, however, was 25 600 000 (a factor of 8 larger). So is the basket size interpreted as either bits or bytes depending on whether the tree is created in-memory or on file?

And looking at the Print() statement of the two trees:
in-memory:

******************************************************************************
*Tree    :tree      : tree                                                   *
*Entries : 100000000 : Total =      3200045589 bytes  File  Size = 1600038353 *
*        :          : Tree compression factor =   2.00                       *
******************************************************************************
*Br    0 :x1        : x1/D                                                   *
*Entries :100000000 : Total  Size= 1600022629 bytes  File Size  =  800017319 *
*Baskets :      251 : Basket Size=    3200000 bytes  Compression=   2.00     *
*............................................................................*
*Br    1 :x2        : x2/D                                                   *
*Entries :100000000 : Total  Size= 1600022629 bytes  File Size  =  800017319 *
*Baskets :      251 : Basket Size=    3200000 bytes  Compression=   2.00     *
*............................................................................*
on-file:
******************************************************************************
*Tree    :tree      : tree                                                   *
*Entries : 100000000 : Total =      1600009277 bytes  File  Size =  529637922 *
*        :          : Tree compression factor =   3.02                       *
******************************************************************************
*Br    0 :x1        : x1/D                                                   *
*Entries :100000000 : Total  Size=  800004473 bytes  File Size  =  264817587 *
*Baskets :       47 : Basket Size=   25600000 bytes  Compression=   3.02     *
*............................................................................*
*Br    1 :x2        : x2/D                                                   *
*Entries :100000000 : Total  Size=  800004473 bytes  File Size  =  264818896 *
*Baskets :       47 : Basket Size=   25600000 bytes  Compression=   3.02     *
*............................................................................*

It is weird that the uncompressed size of the tree is twice as large if it is created in-memory compared to on-file.

Since the speed of writing the tree is so much larger when creating it in-memory, we would prefer to use that, but is there any way to reduce the file-size to that of the on-file created tree?

vaubee · February 23, 2015, 11:37am

Almost forgot, here’s the code of the program used to run these tests (with either the TFile f("test4.root","recreate") line or the TFile f("test3.root","recreate") line commented out):

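#include <cstdlib>   // rand()
#include <iostream>
#include "TFile.h"
#include "TTree.h"
#include "TStopwatch.h"

int main(int argc, char** argv) {
  // On-file variant: enable this line and comment out the TFile further down.
  //TFile f("test4.root","recreate");
  TTree tree("tree","tree");
  double x1,x2;
  // The fourth argument is the requested basket buffer size.
  tree.Branch("x1",&x1,"x1/D",3200000);//,1024*1024*10);
  tree.Branch("x2",&x2,"x2/D",3200000);//,1024*1024*10);
  TStopwatch w;  // starts timing on construction
  for(int i = 0; i < 100000000; ++i) {
    x1 = i+rand()/2147483;
    x2 = i+rand()/2147483;
    tree.Fill();
  }
  // In-memory variant: the file is opened only after the tree has been filled.
  TFile f("test3.root","recreate");
  f.cd();
  tree.Write();
  f.Close();
  std::cout<<w.RealTime()<<std::endl;

  return 0;
}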
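
A side note on the fill expression in the program above, since it may affect the compression factors discussed in this thread: `rand()/2147483` divides two ints, so the quotient is truncated and every value stored in `x1` and `x2` is a whole number, even though the branches are declared as doubles. The helper names below (`test_value`, `is_whole_number`) are illustrative and not part of the original program; this is only a sketch of that one expression, not a statement about how ROOT compresses the data.

```cpp
#include <cassert>
#include <cmath>

// Illustrative helper (hypothetical name): reproduces the expression used to
// fill x1 and x2, for a given loop index i and a given rand() result r.
double test_value(int i, int r) {
    assert(r >= 0);  // rand() never returns a negative value
    // Both operands of '/' are int, so the quotient is truncated before the
    // sum is converted to double.
    return i + r / 2147483;
}

// True when x has no fractional part.
bool is_whole_number(double x) {
    return std::floor(x) == x;
}
```

For example, `test_value(3, 2147483647)` yields 1003.0. Whole-number doubles of this magnitude leave many low-order mantissa bits at zero, which plausibly makes them easier to compress than doubles with a random fractional part would be.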

Wile_E_Coyote · February 23, 2015, 1:52pm

You don’t write which ROOT version you are using (nor which operating system / compiler), so if it’s old enough then the old “TFile Speed” thread (https://root-forum.cern.ch/t/tfile-speed/17549/1) may be relevant.

vaubee · February 23, 2015, 2:16pm

Sorry, I’m using root 5.34/24 running on CentOS 6.6 kernel 2.6.32-504.1.3.el6.x86_64.

Also, I don’t think that this is really an issue of writing speed. I’ve created the same 100M events tree in-memory without writing a file, and that takes about 15 seconds. This means the writing (total time is 21 seconds) takes 6 seconds for a 1.5 GB file. This write rate of 250 MB/s is quite okay. My question in the previous post is more why the in-memory created tree takes so much more space on file, and why it is a factor of 2 bigger than the tree created on-file.

pcanal · March 2, 2015, 6:07pm

Hi,

The difference is (almost) solely in whether the data is compressed or not.

One confusing part was that TTree::Print, in the case of the in-memory tree that was later written to disk, reported twice as much data as was really processed and written, but counted the written data only once. This resulted in a ‘false’ report of a compression factor of 2 when the compression factor in reality was 1! This problem has now been fixed in the v5.34 and v6.02 patch branches as well as in the master.
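
The double counting can be checked against the numbers in the two Print() outputs posted earlier. The sketch below is plain arithmetic, not ROOT code; the function names `raw_bytes` and `compression_factor` are made up for illustration, and it assumes each branch stores one 8-byte double per entry and ignores basket headers and other metadata.

```cpp
#include <cassert>
#include <cstdint>

// Raw payload of a tree in which every branch holds one 8-byte double per
// entry (headers and metadata are ignored).
std::uint64_t raw_bytes(std::uint64_t entries, std::uint64_t branches) {
    return entries * branches * 8;
}

// Compression factor = uncompressed bytes / bytes on disk.
double compression_factor(std::uint64_t uncompressed, std::uint64_t on_disk) {
    assert(on_disk > 0);
    return static_cast<double>(uncompressed) / static_cast<double>(on_disk);
}
```

With 100000000 entries and 2 branches, `raw_bytes` gives 1600000000 bytes. That is half of the 3200045589 bytes Print reported as the total for the in-memory tree, and nearly equal to its 1600038353-byte file size, so the real factor is about 1.0 rather than the printed 2.00. For the on-file tree, 1600009277 / 529637922 gives the printed 3.02.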

The default compression level when the TTree is attached to a TFile when created is ‘1’. When the TTree is created ‘in-memory’ the default compression level is zero (no compression).

In your case this means that the TTree when created in memory was much faster because it did not compress the data but, of course, resulted in a much larger file, due to the factor of 3 in compression available when compressing.

In addition, when writing the TTree in memory the basket optimization is not enabled and thus the baskets are not resized to match the TTreeCache buffer size (but this is a secondary effect in your case). This explains the differences in the end-result basket sizes.

To speed up processing (at the expense of disk size), you can disable the compression in the case where the TTree is connected to a file by doing

TFile f("test4.root","recreate"); f.SetCompressionLevel(0);

Cheers,
Philippe.

vaubee · March 2, 2015, 10:06pm

Okay, so the Print output of the trees was wrong, that does explain some of the issues.

Creating the tree on file without any compression of the file does indeed increase the size of the file to the same 1.5 GB. The speed of this is almost the same as creating the tree in memory and then writing it, 20-21 seconds compared to 18-19 seconds (compared to 54 seconds when creating the tree on a compressed file).

Just as a reference, running bzip2 or gzip on the final 1.5 GB files takes much longer than the 30ish seconds time difference between writing compressed or uncompressed:
bzip2 of the 1.5 GB root-file takes 144 seconds with a final file size of 362 MB.
gzip of the 1.5 GB root-file takes 233 seconds with a final file size of 444 MB.

So what is the best way to speed up writing? Use multiple threads to write to multiple files?

Because these results suggest to me that it’s not the actual writing that takes long, but compressing the data to be written. And since filling the tree has to be done on a single thread (right?), it seems that writing to different files in separate threads would be the only way to make use of the write speed of our raid.
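
The timings quoted in this thread can be turned into rough rates to make that argument concrete. This is a back-of-the-envelope sketch, not a measurement: the function name `rate_mb_per_s` is made up, sizes are the rounded figures from the posts (1.5 GB taken as 1500 MB), and it assumes the 15 seconds needed to fill the tree in memory also applies to the on-file runs.

```cpp
#include <cassert>

// Effective rate in MB/s for handling `megabytes` of data in `seconds`.
double rate_mb_per_s(double megabytes, double seconds) {
    assert(seconds > 0.0);
    return megabytes / seconds;
}
```

Uncompressed, `rate_mb_per_s(1500.0, 21.0 - 15.0)` gives 250 MB/s, matching the figure stated earlier. With the default compression, the 504 MB file took about 54 - 15 = 39 seconds beyond filling, so `rate_mb_per_s(504.0, 39.0)` is roughly 13 MB/s of output, in line with the "about 10 MB/s" from the first post and far below what the raid can absorb.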

pcanal · March 10, 2015, 6:40pm

Cheers,
Philippe.

vaubee · March 11, 2015, 7:57pm

I’ve tried running the tutorials, but it seems that the final “mergedClient.root” file contains a tree with only the last 1M events (of one of the clients).
Apparently the fastMergeServer only keeps the very last input it received, instead of adding everything.

I’ve also tried putting the code into one single program by creating the fastMergeServer on one thread and the clients on two other threads, but this crashes with varying error messages (mostly connected to TVirtualStreamerInfo). I found one other post about writing to multiple files on multiple threads, and it seems this might have to do with meta data information not being thread-safe?

So is there a way to get the fastMergeServer to write all data to file, and can this be put into one single program?

pcanal · March 11, 2015, 8:21pm

[quote]by creating the fastMergeServer on one thread and the clients on two other threads,[/quote]
Did you call TThread::Initialize(); to enable thread support in ROOT?

Cheers,
Philippe.

pcanal · March 11, 2015, 8:59pm

Indeed there was a deficiency in fastMergeServer.C:
[code]diff --git a/tutorials/net/fastMergeServer.C b/tutorials/net/fastMergeServer.C
index a2558c2..dccf26f 100644
--- a/tutorials/net/fastMergeServer.C
+++ b/tutorials/net/fastMergeServer.C
@@ -118,7 +118,7 @@ void fastMergeServer(bool cache = false) {
     delete transient;
     transient = new TMemFile(filename,mess->Buffer() + mess->Length(),length);
     mess->SetBufferOffset(mess->Length()+length);
-    merger.OutputFile(filename);
+    merger.OutputFile(filename,"UPDATE");
     merger.AddAdoptFile(transient);
 
     merger.PartialMerge(TFileMerger::kAllIncremental);[/code]

The default for TFileMerger::OutputFile is to overwrite the output (and hence discard the previously accumulated values).

Cheers,
Philippe.

vaubee · March 11, 2015, 9:52pm

Thank you, Philippe.

Changing the TFileMerger::OutputFile to use “update” has indeed solved the issue of overwriting the data.

I had not included the TThread::Initialize() statement. Including it, the program still crashes most of the time, and even when it runs for a little while, there are error statements along the lines of
Error in TFile::ReadKeys: reading illegal key, exiting after 0 keys

The error messages aren’t always the same, but they all seem to centre on access to the root-file?

pcanal · March 11, 2015, 9:58pm

Hi,

Which version of ROOT are you using? How did you arrange for the transfer of the TMemFile from the worker thread to the merging thread?

Cheers,
Philippe.

vaubee · March 11, 2015, 10:39pm

I’m using root 5.34/24 running on CentOS 6.6 kernel 2.6.32-504.1.3.el6.x86_64.

I’ve also tried running the code on Mac OS X 10.10.2 with root 6.03/02; in that case I always get the error
*** Assertion failure in +[NSUndoManager _endTopLevelGroupings], /SourceCache/Foundation/Foundation-1152.14/Misc.subproj/NSUndoManager.m:340

I’ve also attached the program in question, in case you want to test it.
thread_test.cc (7.75 KB)

pcanal · March 11, 2015, 11:22pm

Hi,

In addition, you need to create one TThread object per thread to fully enable the thread local mechanism (essential at the moment; this requirement will eventually be lifted in v6).

So adding: TThread *th = new TThread(); at the beginning of each of the spawned-into-a-thread functions will solve the problem.

Cheers,
Philippe.

vaubee · March 12, 2015, 12:15am

Thanks, I didn’t know that was needed.

This does improve things, though it does still crash on occasion. And once I reduced the number of times the clients sent data back to the server, the processing time improved a lot. It went from 5 minutes down to less than 30 seconds for 40M events.

pcanal · March 12, 2015, 12:53am

[quote]This does improve things, though it does still crash on occasion.[/quote]
Do you have the stack traces for one of those failures?

Thanks,
Philippe.