Running on a chain of TClonesArray files

Hi,

I need to run over a large chain of files (~1000) containing TClonesArray branches. My reconstruction code can take either a single file or a chain as input, and produces an output root-tuple (one per input). I was wondering whether it is more efficient (in terms of speed) to chain the files together and feed them to my code in one shot, or to process them one at a time.

i.e.

(a)

for(int i = 1; i <= N_of_files; ++i)
{
  chain->Add(input_file_i);
  MyClass * my_class = new MyClass(chain);
  my_class->Loop();
  my_class->Output(output_root_file_i);
}

vs.

(b)

for(int i = 1; i <= N_of_files; ++i)
  chain->Add(input_file_i);

MyClass * my_class = new MyClass(chain);
my_class->Loop();
my_class->Output(output_one_large_root_file);

I need to mention that since these are big files, I/O throughput rates could be an issue here; I’m also interested to hear if this is mostly a matter of taste or if e.g. chaining large files together could slow things down.

Thanks!

–Christos

PS
% which root
/d0usr/products/root/Linux-2-4/v3_05_07d_rh71_locked-GCC_3_1–exception–opt–thread/bin/root

but I’m willing to move to a different version if that makes a difference

Only method b makes sense.
In method a, as you show it, you will reprocess the files
already in the chain multiple times.

Rene
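
To put a number on the reprocessing described above, here is a minimal sketch that needs no ROOT at all. It only counts file passes, assuming that each call to Loop() walks every file currently in the chain. The function names readsPerFileGrowingChain and totalReads are made up for this illustration and are not part of ROOT or of the poster's code.

```cpp
#include <cstddef>
#include <vector>

// Counts how many times each file is read when a single chain is reused
// and grows by one file per iteration, as in loop (a) as first posted.
// Assumption: every Loop() call walks all files currently in the chain.
std::vector<std::size_t> readsPerFileGrowingChain(std::size_t nFiles) {
    std::vector<std::size_t> reads(nFiles, 0);
    std::size_t filesInChain = 0;  // stand-in for the content of the chain
    for (std::size_t i = 0; i < nFiles; ++i) {
        ++filesInChain;            // corresponds to chain->Add(input_file_i)
        for (std::size_t f = 0; f < filesInChain; ++f) {
            ++reads[f];            // corresponds to Loop() visiting file f again
        }
    }
    return reads;
}

// Sums the per-file counts to get the total number of file passes.
std::size_t totalReads(const std::vector<std::size_t>& reads) {
    std::size_t total = 0;
    for (std::size_t r : reads) {
        total += r;
    }
    return total;
}
```

Under that assumption the first file is read N times and the last one once, so the total is N(N+1)/2. For the ~1000 files mentioned in the question, that is 500500 file passes instead of 1000.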

I’m sorry, I did not make myself clear.

Case (a) should’ve been:

for(int i = 1; i <= N_of_files; ++i) 
{
  TChain * chain = new TChain("tree"); 
  chain->Add(input_file_i); 
  MyClass * my_class = new MyClass(chain); 
  my_class->Loop(); 
  my_class->Output(output_root_file_i); 
} 

So, I really process only one file (index: i) at a time.

–Christos

Method b will be faster than your loop. In addition, your loop has
two memory leaks:
- the chain
- the MyClass object

Rene
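
One possible way to avoid the two leaks mentioned above is to create both objects on the stack inside the loop body, so that they are destroyed at the end of every iteration. The sketch below is only an illustration under stated assumptions: StandInChain and StandInAnalysis are hypothetical stand-ins for TChain and MyClass (so that the example compiles without ROOT), the file names are invented, and the live-object counter exists only to make the cleanup visible.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Number of stand-in objects currently alive (for illustration only).
static int g_liveObjects = 0;

// Hypothetical stand-in for TChain: it just remembers the file names.
struct StandInChain {
    std::vector<std::string> files;
    StandInChain() { ++g_liveObjects; }
    ~StandInChain() { --g_liveObjects; }
    StandInChain(const StandInChain&) = delete;
    StandInChain& operator=(const StandInChain&) = delete;
    void Add(const std::string& name) { files.push_back(name); }
};

// Hypothetical stand-in for MyClass: it borrows the chain, it does not own it.
struct StandInAnalysis {
    const StandInChain* chain;
    explicit StandInAnalysis(const StandInChain* c) : chain(c) {
        assert(c != nullptr);
        ++g_liveObjects;
    }
    ~StandInAnalysis() { --g_liveObjects; }
    StandInAnalysis(const StandInAnalysis&) = delete;
    StandInAnalysis& operator=(const StandInAnalysis&) = delete;
    // Returns the number of files it would process.
    std::size_t Loop() const { return chain->files.size(); }
};

// Same shape as the corrected loop (a), but with automatic storage
// instead of new. Returns the number of objects still alive afterwards.
int liveAfterScopedLoop(int nFiles) {
    for (int i = 1; i <= nFiles; ++i) {
        StandInChain chain;                                 // one chain per file
        chain.Add("input_" + std::to_string(i) + ".root");  // invented file name
        StandInAnalysis job(&chain);                        // destroyed before chain
        job.Loop();
    }   // both objects go out of scope here on every iteration
    return g_liveObjects;
}
```

With the real classes the same idea would mean declaring the chain and the MyClass object as local variables in the loop, or calling delete on both at the end of each iteration if they have to stay on the heap.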

Good! Thanks, this is what I was asking for.

[quote="brun"]In addition your loop has
two memory leaks
-the chain
-the MyClass object
Rene[/quote]

Yeah, I know that. It was not a complete example.

Thanks again.