Can I do something to improve reading speed?

I am currently using win32gdk v3.10/2. My applications read and write trees that have a small number (15-20) of branches, each containing a single variable. However, my TTrees can be quite large: 50E6 entries or more.

My problem is that reading and writing these trees takes a very long time. I can only read at about 10 MB/sec from a local drive, whereas reading directly from a binary file is typically 7 times faster. See the attached example (you will need about 300 MB of free space to run it). Below is the output of the attached script, compiled with ACLiC, on my machine (1.5 GHz P4 with 1 GB of RAM and an older IDE drive).

root [0] .L example.cpp++
Info in TWinNTSystem::ACLiC: creating shared library C:\temp\example_cpp.dll
s1f0.1_cint.cxx
s1f0.3_cint.cxx
s1f0.5_cint.cxx
s1f0.7_cint.cxx
s118_.cxx
Creating library C:\temp\example_cpp.lib and object C:\temp\example_cpp.exp
root [1] write()
Timing writing of tree
Real time 0:00:44, CP time 42.734
root [2] read()
Timing reading of tree
Real time 0:00:11, CP time 11.516
root [3] writebin()
Timing writing of binary file
Real time 0:00:13, CP time 4.328
root [4] readbin()
Timing reading of binary file
Real time 0:00:02, CP time 2.797
root [5]
(note: example.bin is 1.65 times larger than example.root)
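
To put the timings above into throughput terms, here is a small worked calculation. It is only a sketch: it assumes the payload is exactly the 10,000,000 entries of five 4-byte floats written by example.cpp, and the helper names `payloadBytes` and `mbPerSec` are made up for illustration.

```cpp
#include <cassert>

// Illustrative helper: raw payload size, assuming every value is a 4-byte float.
double payloadBytes(double entries, int floatsPerEntry) {
    return entries * floatsPerEntry * 4.0;
}

// Illustrative helper: throughput in MB/s, with 1 MB taken as 1e6 bytes.
double mbPerSec(double bytes, double seconds) {
    assert(seconds > 0.0);
    return bytes / 1e6 / seconds;
}
```

With these helpers, `payloadBytes(1e7, 5)` is 200e6 bytes. The binary read (2 s real time) then works out to about 100 MB/s, and the tree read (11 s real time) to about 18 MB/s of payload. Dividing the payload by the 1.65 size ratio gives roughly 121 MB for example.root, or about 11 MB/s of on-disk data, which is consistent with the "about 10 MB/sec" quoted above.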

I’ve even attempted to “down-sample” the file, e.g. by reading only one record out of every 100 or so to get a quick and dirty idea of what's there. That only reduces my time by a factor of about 5, which is still not particularly quick.
Writing is also slow, but I am less concerned about this, since I know ROOT is busy compressing etc. The big issue for me is the reading. Is there anything I can do to speed up I/O?
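
One plausible reason why down-sampling helps so little can be checked with simple arithmetic. The sketch below assumes that each branch is stored in fixed-size buffers that have to be read and decompressed as a whole, and it takes the 32k buffer size quoted in the reply further down as the size of one buffer; per-buffer header overhead is ignored. The function names are hypothetical and nothing here calls ROOT.

```cpp
#include <cassert>
#include <cstdint>

// Entries that fit in one branch buffer, ignoring per-buffer header overhead.
std::int64_t entriesPerBuffer(std::int64_t bufferBytes, std::int64_t bytesPerEntry) {
    assert(bufferBytes > 0 && bytesPerEntry > 0);
    return bufferBytes / bytesPerEntry;
}

// Number of distinct buffers touched when reading entries 0, stride, 2*stride, ...
std::int64_t buffersTouched(std::int64_t nEntries, std::int64_t perBuffer, std::int64_t stride) {
    assert(nEntries > 0 && perBuffer > 0 && stride > 0);
    std::int64_t touched = 0;
    std::int64_t last = -1;  // index of the last buffer that was counted
    for (std::int64_t i = 0; i < nEntries; i += stride) {
        const std::int64_t buf = i / perBuffer;
        if (buf != last) {
            ++touched;
            last = buf;
        }
    }
    return touched;
}
```

Under these assumptions a 32000-byte buffer holds 8000 four-byte floats, so the 10,000,000-entry example has 1250 buffers per branch. A stride of 100 still touches all 1250 of them, while a stride of 80000 would touch only 125. In this picture, reading every 100th entry saves the per-entry work but not the disk reads or the decompression.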

I suspect that typical HEP users of ROOT deal with “wider and shallower” trees than I do; perhaps ROOT is not meant to handle “narrow but deep” trees… Thanks in advance!
Ed
Here’s example.cpp referenced above:

#if !defined(__CINT__) || defined(__MAKECINT__)
#include <cstdio>
#include "TFile.h"
#include "TTree.h"
#include "TRandom.h"
#include "TStopwatch.h"
#endif

void write()
{
   // make a dummy tree to read
   TFile *f = new TFile("example.root","RECREATE");
   TTree *t = new TTree("pt","example");
   Float_t a,b,c,d,e;
   t->Branch("a",&a,"a/F");
   t->Branch("b",&b,"b/F");
   t->Branch("c",&c,"c/F");
   t->Branch("d",&d,"d/F");
   t->Branch("e",&e,"e/F");
   TRandom r;
   printf("Timing writing of tree\n");
   TStopwatch s;
   for(int i=0;i<10000000;i++) {
      a = (Float_t)i;
      b = 1000.*r.Rndm();
      c = 1000.*r.Rndm();
      d = b*b+c*c;
      e = 1.;
      t->Fill();
   }
   s.Stop();
   s.Print();
   f->Write();
   f->Close();
   delete f;
}

void read()
{
   // read the dummy tree made by write()
   TFile *f = new TFile("example.root");
   TTree *t = (TTree *)f->Get("pt");
   Float_t a,b,c,d,e;
   t->SetBranchAddress("a",&a);
   t->SetBranchAddress("b",&b);
   t->SetBranchAddress("c",&c);
   t->SetBranchAddress("d",&d);
   t->SetBranchAddress("e",&e);
   printf("Timing reading of tree\n");
   TStopwatch s;
   for(int i=0;i<t->GetEntries();i++) {
      t->GetEntry(i);
   }
   s.Stop();
   s.Print();
   delete f;
}

void writebin()
{
   // make a dummy binary file - same contents as tree made w/ write()
   FILE *fp = fopen("example.bin","wb");
   TRandom r;
   printf("Timing writing of binary file\n");
   TStopwatch s;
   Float_t a[5];
   for(int i=0;i<10000000;i++) {
      a[0] = (Float_t)i;
      a[1] = 1000.*r.Rndm();
      a[2] = 1000.*r.Rndm();
      a[3] = a[1]*a[1]+a[2]*a[2];
      a[4] = 1.;
      fwrite(a,sizeof(Float_t),5,fp);
   }
   s.Stop();
   s.Print();
   fclose(fp);   // flush and close the file (missing in the original)
}

void readbin()
{
   // read the dummy binary made by writebin()
   // "rb" rather than "r": text mode on Windows can end the read early on binary data
   FILE *fp = fopen("example.bin","rb");
   printf("Timing reading of binary file\n");
   TStopwatch s;
   Float_t a[5];
   for(int i=0;i<10000000;i++) {
      fread(a,sizeof(Float_t),5,fp);
   }
   s.Stop();
   s.Print();
   fclose(fp);
}

Ed,

Here are the results on my machine (Windows XP, P4 2.4 GHz).
As you can see, your binary-file test is slower than the tree when writing (real time).

Rene

1-> WITH COMPRESSION

root [4] write()
Timing writing of tree
Real time 0:00:26, CP time 24.625
root [5] read()
Timing reading of tree
Real time 0:00:07, CP time 6.940
root [6] writebin()
Timing writing of binary file
Real time 0:00:41, CP time 3.976
root [7] readbin()
Timing reading of binary file
Real time 0:00:04, CP time 4.296

2-> NO COMPRESSION (buffer size 32k)

root [1] write()
Timing writing of tree
Real time 0:00:26, CP time 9.113
root [2] read()
Timing reading of tree
Real time 0:00:04, CP time 4.527

3-> NO COMPRESSION (buffer size 64k)

root [1] write()
Timing writing of tree
Real time 0:00:19, CP time 8.803
root [2] read()
Timing reading of tree
Real time 0:00:04, CP time 4.887
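
As a rough reading of the numbers above, the ratios between CP times can be worked out directly. This is only a sketch: it assumes all runs processed the same 10,000,000 entries from example.cpp, and `timeRatio` and the constant names are illustrative, not part of ROOT.

```cpp
#include <cassert>

// Illustrative helper: how many times longer `slower` took than `faster`.
double timeRatio(double slower, double faster) {
    assert(slower > 0.0 && faster > 0.0);
    return slower / faster;
}

// CP times in seconds, copied from the runs above.
const double kReadCompressed    = 6.940;   // 1-> read(), with compression
const double kReadUncompressed  = 4.527;   // 2-> read(), no compression, 32k buffers
const double kReadBinary        = 4.296;   // 1-> readbin()
const double kWriteCompressed   = 24.625;  // 1-> write(), with compression
const double kWriteUncompressed = 9.113;   // 2-> write(), no compression, 32k buffers
```

With these inputs, `timeRatio(kReadCompressed, kReadUncompressed)` is about 1.5 and `timeRatio(kWriteCompressed, kWriteUncompressed)` is about 2.7, while `timeRatio(kReadUncompressed, kReadBinary)` is about 1.05. On this machine, then, reading the uncompressed tree costs roughly the same CPU time as reading the plain binary file.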