Read not affected by SetCacheSize()

I have a rather complicated analysis that reads two TTrees (a, b) from a TFile and writes a third, generated TTree (c) to another TFile. Most of my analysis time is spent on I/O, since it only uses 10-15% of the CPU.

In my analysis I loop over the a tree, and inside this loop I loop over the b tree. The b tree consists of only two branches, one holding a ~16 MB array, the other being just an int. So I guess I should cache the b tree. So I did

	TTree *ftr = (TTree*)sourcefile->Get("frames_tree");
	cout << ftr->GetCacheSize() << endl;
	ftr->SetCacheSize(100000000);
	ftr->AddBranchToCache("*");
	cout << ftr->GetCacheSize() << endl;

The couts prove that the cache size has been changed. Then, inside the loop I do

		for (Long64_t nentry=0; nentry<fnentries;nentry++) 
		{
			ftr->LoadTree(nentry);		
			ftr->GetEntry(nentry);
			...
		}

A while later I test the caching with

And no matter how I change the cache size, whether it is 100000000 or 10000, the result is similar. I even put the SetCacheSize just before the loop to ensure it is not taken over by the other tree, but it did not help… What am I doing wrong?

I am running ROOT 5.32/0; however, the TFile was generated by an older version of ROOT.


Hi,

To analyze what the TTreeCache is doing, use:[code] TTreePerfStats *ps = new TTreePerfStats("ioperf", ftr);

  for (Long64_t nentry=0; nentry<fnentries;nentry++) 
  {
     ftr->LoadTree(nentry);      
     ftr->GetEntry(nentry);
     ...
  }

ps->SaveAs("ioperf.ps");
ps->Draw();
ps->Print("unzip");
ftr->PrintCacheStats();

[/code]

Cheers,
Philippe.

I attach the performance plot I get, since I don't really know how to interpret it. Does it show that after each entry the cache is cleared and then the next entry is read into the cache (the drops to zero in file position)? Also, I am quite puzzled at what happens at entry 100.
ioperf.ps (31.2 KB)

Hi,

Could you also post the root file that was produced by TTreePerfStats and the text output that was printed on the screen?

Thanks,
Philippe

[quote]Also, I am quite puzzled at what happens at entry 100.[/quote]
This is the end of the training period. Since you are explicitly setting up the list of branches, you should consider ending the training period early (i.e. right after the AddBranchToCache) by calling:

	ftr->StopCacheLearningPhase();

Note that during the training phase, there is no actual caching/prefetching.

The graph you sent shows that the TTreeCache does indeed read a large set of baskets in one go at entries 100 and 160. Also, the file is likely to contain 2 or more TTrees that are intertwined in the file and may have been produced by an older version of ROOT (i.e. I see some 'backwards' seeks).

Cheers,
Philippe.
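
To illustrate why ending the training period early matters, here is a toy model in plain C++. It is not ROOT's implementation: the function name `toy_transactions` and all of its parameters are hypothetical, and it only assumes what is stated above, namely that no prefetching happens during the training phase and that afterwards many baskets are fetched in one go.

```cpp
#include <algorithm>
#include <cstdint>

// Toy model (NOT ROOT code): count read transactions for a sequential loop.
//  - During the learning phase, each entry costs one read per branch
//    (no caching/prefetching).
//  - Afterwards, a single bulk read serves `entries_per_fill` entries.
// All names and parameters are hypothetical, for illustration only.
std::int64_t toy_transactions(std::int64_t n_entries,
                              std::int64_t learn_entries,
                              std::int64_t n_branches,
                              std::int64_t entries_per_fill) {
    const std::int64_t learning = std::min(n_entries, learn_entries);
    const std::int64_t cached = n_entries - learning;
    // Ceiling division: a partially used fill still costs one transaction.
    const std::int64_t fills = (cached + entries_per_fill - 1) / entries_per_fill;
    return learning * n_branches + fills;
}
```

With 2 branches and, as a guess taken from the bulk reads seen at entries 100 and 160, 60 entries per fill, `toy_transactions(1000, 100, 2, 60)` counts the 100 uncached entries separately, while `toy_transactions(1000, 0, 2, 60)` models the case where the learning phase is stopped before the loop.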

I’ve added stop learning just before the loop. The output is:

******TreeCache statistics for file: framesandstars_tree.root ******
Number of branches in the cache …: 2
Cache Efficiency …: 0.885714
Cache Efficiency Rel…: 0.083708
Learn entries…: 100
Reading…: 1692033504 bytes in 1072 transactions
Readahead…: 256000 bytes with overhead = 87 bytes
Average transaction…: 1578.389463 Kbytes
Number of blocks in current cache…: 15, total size: 99000249

And I attach the root file. Still it seems like it is learning for 100 events…
ioperf.root (9.88 KB)

Hi,

Can you send me the ROOT file that you are reading?

Thanks,
Philippe.

It’s about 2.9 GB… I put it on:

grb.fuw.edu.pl/lewhoo/framesandstars_tree.root

The tree I am trying to read cached is frames_tree.

OK, it seems that it works, except that it insists on learning for the first 100 entries. However, afterwards I can clearly see a long reading phase, then a very quick fit of many of the frames, then reading again, etc…

Hi,

I cannot reproduce the TTreeCache being inactive even though the Learning Phase has been stopped. Do you have a small complete example I could use to reproduce it? (I tried with both the trunk and v5.32.)

I also understand the backward seeks that are left. Your data file has two unusual features: one of the two branches is much larger than the other (300,000 times), and the AutoSave is set very low (only 8 entries … most likely because it was requested to be around 62 MB). This means that after every 8 entries, the TTree is saved to the file. This results in the previous version being released in the file, and the resulting gap is used to store all the baskets for the small branches (they are only 87 bytes each).

Cheers,
Philippe.

I’ll try to make a small example to reproduce my results, however it may not be so simple.

Meanwhile I got access to much more powerful hardware, where I can store all the entries I need in memory. The performance analysis shows a plot very similar to the one I got when not reading all the entries into memory. So, according to the plot… does it read all the entries? :)

Btw, the cache size becomes negative when set to 1024*1024*1024*2.5
ioperf.root (10.7 KB)
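
One plausible explanation for the negative value, assuming the size passes through a signed 32-bit integer somewhere along the way (the thread does not confirm this): 2.5 GiB is larger than the biggest value such an integer can hold. The plain C++ sketch below uses hypothetical helper names and does the wraparound in 64-bit arithmetic so that it stays well defined.

```cpp
#include <cstdint>
#include <limits>

// The size requested in the post: 1024*1024*1024*2.5 bytes (2.5 GiB).
std::int64_t requested_bytes() {
    return static_cast<std::int64_t>(1024.0 * 1024.0 * 1024.0 * 2.5);
}

// Hypothetical helper: does a byte count fit in a signed 32-bit integer?
bool fits_in_int32(std::int64_t bytes) {
    return bytes >= std::numeric_limits<std::int32_t>::min() &&
           bytes <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper: the value a two's-complement 32-bit integer would
// show after wraparound, computed in 64-bit arithmetic to avoid overflow.
std::int64_t wrapped_to_int32(std::int64_t bytes) {
    const std::int64_t two32 = std::int64_t{1} << 32;
    std::int64_t r = bytes % two32;   // now in (-2^32, 2^32)
    if (r < 0) r += two32;            // now in [0, 2^32)
    if (r > std::numeric_limits<std::int32_t>::max()) r -= two32;
    return r;
}
```

Under that assumption, `fits_in_int32(requested_bytes())` is false and `wrapped_to_int32(requested_bytes())` is negative, whereas the 100000000 used earlier in the thread fits comfortably.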