Chaining (or something else) TRFIOFiles

Hi All,
I have about 100 files of roughly 60 MB each (each storing 1000 TObjects) in CASTOR (e.g. a whole dataset). I’d like to be able to give my macro the dataset name and have it do something like:

TChain * myChain = new TChain("datasetName", "");
myChain->Add("/castor/cern.ch/user/m/metson/ASAP/dataset/*.root");

This however doesn’t seem to work (when I ls() the chain I get no output). My first guess is because the files should be accessed as TRFIOFile not TFile; my second is that the TObjects should be held in a TTree for chaining.

I could put these TObjects into a TTree if there is no other way to make a chain (or something chain-like) with them, though I would prefer not to. What I can’t do is take all the files out of CASTOR to use them as TFiles. So my questions are: how do I chain TRFIOFile objects? Can I make a chain of TObjects that aren’t in a TTree?
Cheers
Simon

If your files contain a TTree, you can do:

TChain * myChain = new TChain("treeName", "");
myChain->Add("rfio:///castor/cern.ch/user/m/metson/ASAP/dataset/*.root");

otherwise you can use the TDSet class and do

TDSet * myds = new TDSet("datasetName", "");
myds->Add("rfio:///castor/cern.ch/user/m/metson/ASAP/dataset/*.root");

I VERY STRONGLY recommend the approach with TChain (but you must
have TTrees inside)

Rene
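
As a rough illustration of what chaining buys you, the sketch below models a chain as a list of per-file entry counts and maps a global entry number to a (file, local entry) pair. It is plain C++ with no ROOT dependency. ToyChain, add and locate are made-up names, and this is a toy model under those assumptions, not a description of how TChain is implemented internally.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a chain of files (hypothetical, not the ROOT TChain class).
class ToyChain {
public:
    // Register a file together with the number of entries it holds.
    void add(const std::string& fileName, std::size_t nEntries) {
        files_.push_back(fileName);
        // offsets_[i] is the global number of the first entry in file i.
        offsets_.push_back(total_);
        total_ += nEntries;
    }

    // Total number of entries over all registered files.
    std::size_t entries() const { return total_; }

    // Map a global entry number to (file index, entry number inside that file).
    std::pair<std::size_t, std::size_t> locate(std::size_t globalEntry) const {
        assert(globalEntry < total_);
        std::size_t i = offsets_.size() - 1;
        while (offsets_[i] > globalEntry) --i;  // walk back to the owning file
        return {i, globalEntry - offsets_[i]};
    }

    const std::string& fileName(std::size_t i) const { return files_[i]; }

private:
    std::vector<std::string> files_;
    std::vector<std::size_t> offsets_;
    std::size_t total_ = 0;
};
```

With three files of 1000, 1000 and 500 entries, global entry 1000 resolves to the first entry of the second file, so user code can loop from 0 to entries() without caring where the file boundaries are.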

Ok great, thanks Rene. Aside from convenience, what are the benefits of trees? I assume they make IO easier…

Trees have many advantages. Just to list a few:
- There is a query language (see TTree::Draw). Most of the time there is no need to write C++ code to histogram data.
- You read only what is strictly necessary (thanks to the split mode), i.e. processing is much faster.
- Collections of Trees (TChains) can be processed as if they were one single Tree.
- Data compaction is in general better.
- You can browse the info in the Tree with the browser.
- Trees can be used in parallel mode with PROOF.

Rene
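
The "read only what is strictly necessary" point can be pictured with a small plain C++ toy that stores the same events either as whole objects or as one column per data member. ToyEvent, UnsplitStore and SplitStore are hypothetical names invented for this sketch, and the byte counting is only a crude stand-in for real I/O. It does not reproduce ROOT's on-disk format.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical event type, used only for this illustration.
struct ToyEvent {
    double energy;
    double px, py, pz;
    int nTracks;
};

// Unsplit layout: each entry is one whole object.
struct UnsplitStore {
    std::vector<ToyEvent> entries;

    void fill(const ToyEvent& e) { entries.push_back(e); }

    // Summing one member still walks over every complete object.
    double sumEnergy(std::size_t& bytesTouched) const {
        double sum = 0.0;
        bytesTouched = 0;
        for (const ToyEvent& e : entries) {
            sum += e.energy;
            bytesTouched += sizeof(ToyEvent);
        }
        return sum;
    }
};

// Split layout: one column per data member.
struct SplitStore {
    std::vector<double> energy, px, py, pz;
    std::vector<int> nTracks;

    void fill(const ToyEvent& e) {
        energy.push_back(e.energy);
        px.push_back(e.px);
        py.push_back(e.py);
        pz.push_back(e.pz);
        nTracks.push_back(e.nTracks);
        assert(energy.size() == nTracks.size());  // columns stay aligned
    }

    // Only the energy column is visited; the other columns are never touched.
    double sumEnergy(std::size_t& bytesTouched) const {
        double sum = 0.0;
        bytesTouched = 0;
        for (double v : energy) {
            sum += v;
            bytesTouched += sizeof(double);
        }
        return sum;
    }
};
```

Both stores return the same sum, but the split one touches only sizeof(double) bytes per event instead of sizeof(ToyEvent). Keeping like values together in one column is also what tends to help compression.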

Ok I have my objects in a TTree now, so how do I access the object? I’d like to do something like:

But it looks like the actual object has been split up more than I want - I want a TTree whose leaves are MyEvent objects, not a tree whose leaves are MyEvent's data members (I think). If I set the split level to 0 I have the MyEvent objects as leaves:

leaf->GetTypeName() (const char* 0x9e2de9c)"MyEvent"

How should I get to the MyEvent object held in the leaf? A cast doesn’t crash, but the resulting MyEvent object holds garbage.

The root file resulting from my orca job is in ~metson/public as is the library you’ll need to read the MyEvent objects etc. The code that makes the root file is here

Oh yes, the MyEvent object is defined in here

[quote]But it looks like the actual object has been split up more than I want - I want a TTree whose leaves are MyEvent objects, not a tree whose leaves are MyEvent's data members (I think).[/quote]
Unless the data members are completely meaningless on their own, we recommend splitting the object, so that the data can be read partially and so that compression is more effective (because like objects/values are stored together). In addition, the proper split level can be decided after a careful performance study of the real use case (i.e. try a real example and vary the split level).

To retrieve the object, do not manipulate the leaf directly (unless you are willing to spend time understanding the internals of TTree). Instead simply use:

MyEvent *evt = 0;
thetree->SetBranchAddress("evt159200011", &evt);
thetree->GetEntry(entrynumber);
// ... now evt points to the correct MyEvent ...

Also, the name of your branch (evt159200011) makes me think that you might be missing the point of TTree. In general you should have one branch (possibly split) for MyEvent and many entries (one per event instance). It might be that in your case you have one branch per event instance, which defeats the purpose of TTree. You may want to re-read the User’s Guide chapter on TTree.

Cheers,
Philippe
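
The "one branch, many entries" layout and the bind-once, read-many access pattern can be sketched in plain C++ without ROOT. ToyEvent, ToyBranch, setAddress, getEntry and totalEnergy are invented for this illustration and only mimic the shape of the SetBranchAddress / GetEntry calls above. In particular the toy binds a plain pointer, whereas the ROOT snippet passes the address of a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical event type for illustration; not the MyEvent class from the thread.
struct ToyEvent {
    int id;
    double energy;
};

// Toy branch: one named column holding many entries of the same type.
class ToyBranch {
public:
    void fill(const ToyEvent& e) { entries_.push_back(e); }
    std::size_t size() const { return entries_.size(); }

    // Bind the caller's object once; every getEntry call overwrites it.
    void setAddress(ToyEvent* target) { target_ = target; }

    // Copy entry i into the bound object.
    void getEntry(std::size_t i) const {
        assert(target_ != nullptr && i < entries_.size());
        *target_ = entries_[i];
    }

private:
    std::vector<ToyEvent> entries_;
    ToyEvent* target_ = nullptr;
};

// Event loop: one binding, then one getEntry call per event.
double totalEnergy(ToyBranch& branch) {
    ToyEvent evt{};
    branch.setAddress(&evt);
    double sum = 0.0;
    for (std::size_t i = 0; i < branch.size(); ++i) {
        branch.getEntry(i);
        sum += evt.energy;
    }
    branch.setAddress(nullptr);  // evt is about to go out of scope
    return sum;
}
```

With one branch per event instance, the loop body would instead have to look up a different branch name for every event, and the binding could not be reused.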