Can somebody provide me with a link to some documentation/discussion about basket size and its dependence on the number of leaves, disk access and CPU cost, and program memory? Is the number of branches relevant (e.g. will two branches with different basket sizes or numbers of leaves be saved at different times, can I sync them, etc.)?
PS I found an older post on this subject (from 1998, I think), but things must have evolved since.
The fundamentals of basket size optimization have not changed.
Each end branch has a basket of the size provided in the branch construction (default 32K). Except in the case of a branch created by a list of leaves, each end branch has exactly one leaf.
When a basket is full (this is data dependent), it is flushed to disk and emptied (in memory).
Each branch’s ‘save cycle’ is independent of the others.
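To make the independent save cycles concrete, here is a minimal toy model in plain C++. It is not ROOT code: `ToyBasket`, `SimulateFill`, the branch names and the per-entry sizes (4 and 400 bytes) are all made up for illustration, and compression is ignored. It only shows that two branches with the same basket size flush at very different entry numbers when their per-entry payloads differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for one branch's in-memory buffer (hypothetical, not ROOT's TBasket).
struct ToyBasket {
    std::string name;
    std::size_t capacity;                     // basket size in bytes
    std::size_t used = 0;                     // bytes currently buffered in memory
    std::vector<std::size_t> flushedAtEntry;  // entry numbers at which a flush happened

    ToyBasket(std::string n, std::size_t cap) : name(std::move(n)), capacity(cap) {
        assert(cap > 0);
    }

    // Buffer one entry's data; if it would not fit, "flush to disk" and empty first.
    void Fill(std::size_t entry, std::size_t bytes) {
        if (used + bytes > capacity) {
            flushedAtEntry.push_back(entry);
            used = 0;
        }
        used += bytes;
    }
};

// Fill two branches entry by entry; each one decides on its own when to flush.
std::vector<ToyBasket> SimulateFill(std::size_t nEntries) {
    std::vector<ToyBasket> branches;
    branches.emplace_back("px", 32000);    // assumed 4 bytes per entry
    branches.emplace_back("hits", 32000);  // assumed 400 bytes per entry
    for (std::size_t entry = 0; entry < nEntries; ++entry) {
        branches[0].Fill(entry, 4);
        branches[1].Fill(entry, 400);
    }
    return branches;
}
```

With these assumed sizes, calling `SimulateFill(10000)` has the small branch flush once (at entry 8000) while the large one flushes every 80 entries, so their baskets on disk do not line up on entry boundaries.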
There is no real advantage (that I can think of) in sync-ing the baskets of the branches, except if you want to check-point the whole tree (i.e. leave the file in a fully readable state in case of a crash). This can be done via the AutoSave mechanism (or a direct call to Write). However, this ‘check-pointing’ is relatively expensive (the TTree object and the current content of the baskets are written to disk).
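As a rough illustration of why check-pointing costs extra, the sketch below counts disk writes in a toy model. It is plain C++, not ROOT: `CountWrites`, `WriteCount` and all sizes are hypothetical. It assumes, following the description above, that a check-point forces every partially filled basket out to disk, and it does not count the additional write of the TTree object itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy accounting of basket writes (illustrative names, not ROOT's).
struct WriteCount {
    std::size_t fullBasketWrites = 0;  // flushes triggered by a full basket
    std::size_t checkpointWrites = 0;  // partial baskets forced out by a check-point
};

// bytesPerEntry[b] is the payload branch b adds per entry.
// checkpointEvery == 0 disables check-pointing.
WriteCount CountWrites(const std::vector<std::size_t>& bytesPerEntry,
                       std::size_t basketSize,
                       std::size_t nEntries,
                       std::size_t checkpointEvery) {
    assert(basketSize > 0);
    std::vector<std::size_t> used(bytesPerEntry.size(), 0);
    WriteCount count;
    for (std::size_t entry = 0; entry < nEntries; ++entry) {
        // Normal operation: each branch flushes independently when its basket is full.
        for (std::size_t b = 0; b < used.size(); ++b) {
            if (used[b] + bytesPerEntry[b] > basketSize) {
                ++count.fullBasketWrites;
                used[b] = 0;
            }
            used[b] += bytesPerEntry[b];
        }
        // Check-point: every non-empty basket is written, however little it holds.
        if (checkpointEvery != 0 && (entry + 1) % checkpointEvery == 0) {
            for (std::size_t b = 0; b < used.size(); ++b) {
                if (used[b] > 0) {
                    ++count.checkpointWrites;
                    used[b] = 0;
                }
            }
        }
    }
    return count;
}
```

For example, with two branches of 4 and 400 bytes per entry, a 32000-byte basket and 10000 entries, `CountWrites({4, 400}, 32000, 10000, 0)` gives 125 writes of full baskets, while check-pointing every 1000 entries gives 120 full writes plus 20 writes of partly filled baskets. The smaller, more frequent writes are where the extra cost in this toy model comes from.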