Data Structure compatible with hadd to store arrays of numbers

hep_physics · June 13, 2017, 5:42pm

I get a bunch of numbers per event and would like to store them.

I implemented a tree to do this, but I noticed that hadd now takes an extremely long time: on the order of 10 hours for a merge that used to take just under a minute without the tree. The stored data is not large either (< 50 MB per file with the tree), spread over hundreds of files.

With tree: < 50 MB per file, hadd: 10+ hours
Without tree: < 1 MB per file, hadd: < 1 minute

Is there an alternative data structure I can use, or a way to reduce the time it takes to hadd the files?

pcanal · June 15, 2017, 6:27am

hadd is usually very fast, even with TTrees … What command line do you use to invoke hadd? Can you share a few of your files so we can try to reproduce the problem? Is the problem linear? (i.e. if you have 100 files, does it take 10 times longer than hadd-ing 10 files?)
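
One way to answer the linearity question is to time hadd on subsets of increasing size and compare the cost per file. The following is only a minimal sketch: it assumes hadd is on the PATH and that the input files match *.root in the current directory. The helper names time_hadd and seconds_per_file and the subset sizes are illustrative choices, not part of ROOT.

```python
import subprocess
import tempfile
import time
from pathlib import Path


def time_hadd(inputs):
    """Merge `inputs` with `hadd -f` into a scratch file; return wall-clock seconds."""
    with tempfile.TemporaryDirectory() as scratch:
        # Write the merged file outside the input directory so it is
        # never picked up as an input on a later run.
        output = Path(scratch) / "merged.root"
        start = time.perf_counter()
        subprocess.run(
            ["hadd", "-f", str(output), *[str(p) for p in inputs]],
            check=True,
            stdout=subprocess.DEVNULL,
        )
        return time.perf_counter() - start


def seconds_per_file(timings):
    """Map {number_of_files: total_seconds} to {number_of_files: seconds_per_file}."""
    return {n: total / n for n, total in sorted(timings.items())}


if __name__ == "__main__":
    files = sorted(Path(".").glob("*.root"))
    timings = {}
    for n in (5, 10, 20):  # subset sizes are arbitrary; pick what fits your data
        if n <= len(files):
            timings[n] = time_hadd(files[:n])
    for n, per_file in seconds_per_file(timings).items():
        print(f"{n:4d} files: {per_file:.2f} s per file")
```

If the seconds-per-file figure stays roughly constant as the subset grows, the merge time is linear in the number of files; if it climbs, the scaling is worse than linear.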

hep_physics · June 15, 2017, 10:01pm

Indeed, the time it took was unusual. You are right, it increases linearly. I use
hadd -f output.root *.root

Here are the files: https://www.dropbox.com/sh/a5wdf4x2hq3iakq/AACTr4rzoUJzMfxgU3KuHBn7a?dl=0
