Fastest way of reading all events of a TTree

John · April 6, 2004, 11:01pm

I need to read the content of a large number of fairly simple TTree files, where each file has about 500 events and each event has about 400 - 500 data values. I need to read the values and write them out in a different format. Any advice on how to read the values quickly? Thanks.

pcanal · April 7, 2004, 1:31pm

Hi John,

I am not sure what you want to optimize when you say ‘fast’.

Anyway, I suppose that you can start by using TTree::MakeClass to generate a code skeleton that you would modify to create and fill a new tree with your new format.

Cheers,
Philippe.

John · April 9, 2004, 4:06am

Philippe,

Someone else has created the ROOT files (TTrees, basically); I simply need to read them. Since I have a great many such files to read, I would like to read them as fast as possible. I am trying to dump the content to another format (basically raw binary). Is there anything to optimize?
Most types of values seem to be straightforward, but how about strings?

John

pcanal · April 9, 2004, 4:35am

Out of curiosity, why/how do you use this raw binary format?

So indeed, use TTree::MakeClass to create a code skeleton.
In the Loop method of the skeleton, there is a for loop where you can read the data and then dump any which way you see fit.

Then, to handle the numerous files, just bundle them in a TChain object (see its documentation for details, but basically a TChain acts like a TTree that spans multiple files).

Cheers,
Philippe

John · April 9, 2004, 4:21pm

Philippe,

Thanks for the information.

I am reading TTrees (STAR *.tags.root files) to build external indices for the tags. This is used to search for events satisfying conditions on tags, part of Grid Collector project http://crd.lbl.gov/~kewu/gc-proc.html.

We need to read all tag files as they are generated (hundreds of thousands). Currently, it takes about a minute to read one such file, which contains about 500 or so events (each with about 500 or so variables). Since the file size is less than 300 KB typically, it should not take more than a second. Any suggestion about the problem with speed ?

In addition, how much advantage can be gained by working on multiple files at once using TChain?

John · April 9, 2004, 4:24pm

Sorry, the URL is crd.lbl.gov/~kewu/ps/gc-proc.html

John · April 9, 2004, 4:59pm

Philippe,

One important detail that I forgot to mention in the previous messages: the tag files generated at different times have slightly different numbers of leaves. As far as I can tell, each variable is in its own branch. This means that I cannot use the code generated by TTree::MakeClass. What we are doing right now is basically using a combination of GetListOfLeaves and GetValuePointer to retrieve the values of each variable.

I imagine it would be better to read all records of one variable (one branch) in one shot. Is there such a function ? What is the best way to do this ? When I am done with a branch, is it better to close the branch in some way to avoid the buffer related to the branch being kept in memory ? Does this affect the speed of reading at all ?

pcanal · April 9, 2004, 5:19pm

Hi John,

Could you provide me with a concrete example so that I can reproduce this slow I/O problem?

Cheers,
Philippe.

PS. Could you also try to test your program when the data files are on a SCSI disk instead of an IDE disk?

John · April 9, 2004, 5:26pm

Philippe,

I am in the process of rewriting the code to read the tag files, which is why I am trying to get some advice… Hopefully, in a day or two I will have some test code to play with…

pcanal · April 9, 2004, 9:29pm

Hi John,

On a related note, we are hoping to introduce later this year a similar ‘indexing’ scheme for TTrees (Bit Slice).

Cheers,
Philippe

John · April 10, 2004, 5:44am

Here is my current test program. It seems to do a lot better than the previous version. Do you spot anything that might cause performance problems ? Thanks.

PS: do you have any description about the bit slice index under development ?
readTag.cpp (8.59 KB)

pcanal · April 10, 2004, 3:13pm

Hi,

To be accurate I would need one of your data files.

It is possible that it would be more efficient to read a full entry at a time rather than one branch at a time.

Also, if the process is not I/O bound (I am not sure yet), you would need to remove as much of the calculation as possible from the loop, i.e. the inside of the loop should basically just be

xxx->GetEntry(i);
yyy->push_back(*val);

Cheers,
Philippe.

John · April 12, 2004, 3:22pm

Hi, Philippe,

Here is a sample data file. I am working on a version that reads all variables of one event in one shot and will see how that goes…

John
st_P03ih_4080089_raw_0040175.tags.root (281 KB)

John · April 12, 2004, 10:10pm

Hi, Philippe,

Here is my latest version of the test code. It now has a second read function, readEvent (the previous one has been renamed readBranch). The function main now calls gettimeofday to time the two read functions. Note that both versions read the content of the root file into memory. I need the content to be in memory in preparation for the next steps. In readEvent, I cannot use the code generated from MakeClass because the number of leaves and the number of variables might change.

One unusual thing is that the relative performance of the two functions appears to depend on when they are invoked. Here is what I mean. I can give the same file twice on the command line. The first time the file is read, readEvent takes less time than readBranch, but the second time the file is read, the time required by readBranch decreases to less than that of readEvent. For readEvent, the time required is about the same in both cases. This appears to suggest that I should use the first version, readBranch. What do you think? Do you spot anything unusual?
readTag.cpp (19.6 KB)
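
A minimal sketch of how the timing comparison described above could be structured so that the first (cold) read of a file is reported separately from later (warm) reads. It is not the code from readTag.cpp: the helper name timeEachRun is hypothetical, and std::chrono::steady_clock is used here in place of gettimeofday. The reader callback stands in for a call such as readBranch or readEvent on one file.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: runs `reader` `repeats` times and returns the wall-clock
// seconds of each run separately, so the first (cold-cache) run is not averaged
// together with the later (warm-cache) runs.
std::vector<double> timeEachRun(const std::function<void()>& reader, int repeats) {
    assert(repeats >= 0);
    std::vector<double> seconds;
    seconds.reserve(static_cast<std::size_t>(repeats));
    for (int i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        reader();  // e.g. a lambda that calls one read function on one file
        const auto stop = std::chrono::steady_clock::now();
        seconds.push_back(std::chrono::duration<double>(stop - start).count());
    }
    return seconds;
}
```

Comparing element 0 of the result with the remaining elements, for each read function in turn, separates the effect of the OS disk cache from the cost of the read logic itself.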

John · April 12, 2004, 10:46pm

Here is a file containing a screen dump of three consecutive tests. In each of the three tests, the same file is read three times.
To avoid the need to actually download the file, I am including the text here as well. The only difference is that the text here has different line breaks.

: time readTag st_P03ih_4080089_raw_0040175.tags.root st_P03ih_4080089_raw_0040175.tags.root st_P03ih_4080089_raw_0040175.tags.root
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readBranch(st_P03ih_4080089_raw_0040175.tags.root) took 0.239277 sec
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readEvent(st_P03ih_4080089_raw_0040175.tags.root) took 0.155497 sec
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readBranch(st_P03ih_4080089_raw_0040175.tags.root) took 0.144944 sec
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readEvent(st_P03ih_4080089_raw_0040175.tags.root) took 0.165296 sec
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readBranch(st_P03ih_4080089_raw_0040175.tags.root) took 0.176268 sec
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readEvent(st_P03ih_4080089_raw_0040175.tags.root) took 0.189823 sec
0.960u 0.210s 0:02.68 43.6% 0+0k 0+0io 1504pf+0w
: time readTag st_P03ih_4080089_raw_0040175.tags.root st_P03ih_4080089_raw_0040175.tags.root st_P03ih_4080089_raw_0040175.tags.root
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readBranch(st_P03ih_4080089_raw_0040175.tags.root) took 0.309928 sec
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readEvent(st_P03ih_4080089_raw_0040175.tags.root) took 0.191959 sec
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readBranch(st_P03ih_4080089_raw_0040175.tags.root) took 0.185533 sec
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readEvent(st_P03ih_4080089_raw_0040175.tags.root) took 0.1896 sec
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readBranch(st_P03ih_4080089_raw_0040175.tags.root) took 0.177546 sec
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readEvent(st_P03ih_4080089_raw_0040175.tags.root) took 0.190082 sec
1.020u 0.290s 0:01.59 82.3% 0+0k 0+0io 1504pf+0w
: time readTag st_P03ih_4080089_raw_0040175.tags.root st_P03ih_4080089_raw_0040175.tags.root st_P03ih_4080089_raw_0040175.tags.root
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readBranch(st_P03ih_4080089_raw_0040175.tags.root) took 0.221441 sec
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readEvent(st_P03ih_4080089_raw_0040175.tags.root) took 0.188628 sec
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readBranch(st_P03ih_4080089_raw_0040175.tags.root) took 0.179438 sec
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readEvent(st_P03ih_4080089_raw_0040175.tags.root) took 0.183459 sec
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readBranch(st_P03ih_4080089_raw_0040175.tags.root) took 0.161727 sec
tag file “st_P03ih_4080089_raw_0040175.tags.root” contains 632 entries
readTag: readEvent(st_P03ih_4080089_raw_0040175.tags.root) took 0.192985 sec
0.970u 0.290s 0:03.97 31.7% 0+0k 0+0io 1504pf+0w

pcanal · April 13, 2004, 6:04pm

Hi John,

For your purpose you can greatly improve the performance (since your files are relatively small and you want to keep them in memory anyway … assuming you have enough RAM, of course) by doing:

TFile *tf = new TFile(argv[i]);
TTree *tt = reinterpret_cast<TTree*>(tf->Get("Tag"));
tt->SetMaxVirtualSize( 512 * 1024 * 1024 ) ; // default is 64000000

This will tell the tree to cache in memory the content of the branches (instead of keeping just one buffer per branch).

The timing depends on many factors (OS disk cache, hardware disk cache, pattern of access, etc.), and the test you did does not seem to reflect the actual use case you are going to have (open one file, read its content, maybe write an output file, open another file, read its content, etc.). So it is hard to draw any conclusion from it.

Cheers,
Philippe.

John · April 13, 2004, 6:57pm

Philippe,

Thanks for the suggestion. Increasing the buffer size helps to increase the CPU usage from 40% to about 90% in the tests I have done so far. There is little difference between reading data one event at a time and one branch at a time. Based on this, I will use the version that reads one branch at a time because the code is slightly simpler. You have been very helpful and I am very grateful for your help.

brun · April 17, 2004, 10:04am

John,

I have looked at your code and see that your logic is about the most
inefficient thing that one can do when reading Trees!
-setting the branch address at each event!
-having switch/case statements when this job is already done by ROOT for you
-creating short arrays with new

Just to give you an idea, I have measured the time to read your Tag ntuple with 632 entries via MakeClass-generated code (not necessarily the best one). It takes 0.02 seconds to read your Tag Tree. To check this, just do the following with one of your Tag Trees.
root > TFile f("st_xxx.root");
root > Tag.MakeClass("Tag");

in Tag.C after the loop, add
printf("Read %d entries and %d bytes\n",nentries,nbytes);
then in a new session do
root > .L Tag.C
root > Tag t
root > t.Loop()

Rene

John · April 19, 2004, 5:46am

Rene,

Thanks for your interest in my effort to read these tag files. This is my first time dealing with any root files directly. That is why I wanted to get some advice from the experts. I probably should have started my question with the requirements that I am facing. There are two basic requirements: the tag files have different numbers of leaves, and the contents of the files have to be transposed and written to files. Here are more details.

One of the restrictions that I probably did not mention in the first few postings was that the number of leaves (and the number of variables in the leaves) are not fixed for all the tag files that I have to deal with. The code generated from MakeClass can not be used. A safe thing to do was to open the file, figure out the leaves and dynamically allocate space for the variables. Dynamic memory allocation is also necessary for a second reason.

The second reason is that I have to write out the content of the root file in another format, which basically means transposing the data from a “row oriented format” to a “column oriented format”. Since the tag files are relatively small, the most efficient option should be to transpose the data in memory. In particular, the output of the “column oriented format” is written to many files (one per variable). On most systems, it would be impractical to open all the files at the same time. This also makes it necessary to store the content of the tag file in memory.
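
A minimal sketch of the in-memory transposition described above, under simplifying assumptions that are not from the actual tag files: every value is stored as a double, every event has the same number of variables, and the names Row, Column, transpose and appendColumn are hypothetical. Each output file is opened, appended to and closed in turn, so only one file is open at any time.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

using Row = std::vector<double>;     // one event: one value per variable
using Column = std::vector<double>;  // one variable: one value per event

// Transposes events (rows) into per-variable columns.
// Assumes all rows have the same length.
std::vector<Column> transpose(const std::vector<Row>& rows) {
    if (rows.empty()) return {};
    const std::size_t nVars = rows.front().size();
    std::vector<Column> columns(nVars, Column(rows.size()));
    for (std::size_t e = 0; e < rows.size(); ++e) {
        assert(rows[e].size() == nVars);
        for (std::size_t v = 0; v < nVars; ++v) columns[v][e] = rows[e][v];
    }
    return columns;
}

// Appends one column as raw binary to its own file. The stream is closed when
// the function returns, so only one output file is open at a time.
bool appendColumn(const std::string& path, const Column& column) {
    std::ofstream out(path, std::ios::binary | std::ios::app);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(column.data()),
              static_cast<std::streamsize>(column.size() * sizeof(double)));
    out.flush();
    return static_cast<bool>(out);
}
```

A caller would transpose the events of one tag file and then loop over the columns, calling appendColumn once per variable with that variable's output path.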

Since the files are relatively small, the suggestion to increase the virtual memory size to allow the content of the root file to be read into memory cache is a reasonable thing. I imagine that there is no way to do any better than that. However, I would not be surprised to be proven wrong.

brun · April 19, 2004, 8:21am

John,

ROOT Tree files are “column-oriented”, not “row-oriented” !
At the Tree creation time, you can optimize the Tree/Branch storage
in view of future queries
-by allocating large buffer sizes for the branch(es) the most used in queries
-by storing these branches directly to separate files (see TBranch::SetFile)

It is totally unrealistic to assume that you can fit one branch in memory.
The TAG files might be small, but you will have zillions of these files.
You need an automatic disk overflow mechanism in writing and reading.

Rene
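
As an illustration of the disk overflow idea raised above, here is a minimal sketch of a per-variable buffer that spills to its file whenever it reaches a fixed capacity, so memory use stays bounded no matter how many tag files are processed. This is not how ROOT implements its branch buffers; the class name SpillingColumn and its interface are hypothetical, and values are assumed to be doubles.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: buffers the values of one variable in memory and
// appends them to a raw binary file whenever the buffer reaches `capacity`.
class SpillingColumn {
public:
    SpillingColumn(std::string path, std::size_t capacity)
        : path_(std::move(path)), capacity_(capacity) {
        assert(capacity_ > 0);
        buffer_.reserve(capacity_);
    }

    // Adds one value; spills to disk automatically when the buffer is full.
    // If the spill fails, the values stay buffered and are retried later.
    void push(double value) {
        buffer_.push_back(value);
        if (buffer_.size() >= capacity_) flush();
    }

    // Appends all buffered values to the file and empties the buffer.
    bool flush() {
        if (buffer_.empty()) return true;
        std::ofstream out(path_, std::ios::binary | std::ios::app);
        if (!out) return false;
        out.write(reinterpret_cast<const char*>(buffer_.data()),
                  static_cast<std::streamsize>(buffer_.size() * sizeof(double)));
        out.flush();
        if (!out) return false;
        written_ += buffer_.size();
        buffer_.clear();
        return true;
    }

    std::size_t buffered() const { return buffer_.size(); }
    std::size_t written() const { return written_; }

private:
    std::string path_;
    std::size_t capacity_;
    std::vector<double> buffer_;
    std::size_t written_ = 0;  // values written by this object so far
};
```

One such object per variable would be filled while looping over the tag files, with a final flush() call at the end to write out whatever is still buffered.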