Split a large ROOT tree file into smaller ROOT tree files

raly · May 5, 2020, 11:10pm

Dear experts,

I have a ROOT tree file that contains a large number of entries. When I run my analysis code on the file, it takes a long time and then my job is killed by Condor.
I would like to split the large ROOT file into smaller ROOT files.

Any suggestions?

Thanks in advance.

Regards
Reham

Wile_E_Coyote · May 6, 2020, 9:11am

Maybe you could consider another approach.
Many ROOT methods offer two parameters “Long64_t nentries = kMaxEntries” and “Long64_t firstentry = 0”, see e.g.: TTree::Draw
So, maybe you could add such options to your “analysis code” and then you could easily create many jobs with limited “nentries”, each starting at a different “firstentry”.
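
The "nentries"/"firstentry" idea above can be sketched as follows: a minimal, ROOT-independent helper that divides a known total number of entries into contiguous (firstentry, nentries) pairs, one per job. The function name `entry_ranges`, the `./analysis` executable and its `--firstentry`/`--nentries` flags are hypothetical placeholders for illustration; they are not part of ROOT or of the original poster's code.

```python
def entry_ranges(total_entries, n_jobs):
    """Return (firstentry, nentries) pairs that together cover [0, total_entries).

    The remainder is spread over the first jobs, so job sizes differ by at most one.
    Jobs that would receive zero entries are dropped.
    """
    if total_entries < 0 or n_jobs <= 0:
        raise ValueError("total_entries must be >= 0 and n_jobs must be > 0")
    base, extra = divmod(total_entries, n_jobs)
    ranges = []
    first = 0
    for job in range(n_jobs):
        n = base + (1 if job < extra else 0)
        if n == 0:
            break  # fewer entries than jobs: nothing left to hand out
        ranges.append((first, n))
        first += n
    return ranges


if __name__ == "__main__":
    # Hypothetical command line: the analysis program and its flags are assumptions.
    for first, n in entry_ranges(1_000_000, 4):
        print(f"./analysis --firstentry {first} --nentries {n}")
```

Each printed line could then serve as the argument list of one batch job, so that no single job has to process the whole tree.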

BTW, I don’t know of any easy way to “split” trees, but maybe @sbinet has something.

sbinet · May 6, 2020, 9:52am

nope.

well, groot certainly has all the building blocks to implement a root-split command, but this hasn’t been packaged up yet:

groot/rtree#Copy allows copying entries from an input tree to an output one,
groot/rtree#Reader allows reading spans of entries

so it shouldn’t be too hard to whip something up.

sbinet · May 6, 2020, 1:41pm

done: https://godoc.org/go-hep.org/x/hep/groot/cmd/root-split

$> go get go-hep.org/x/hep/groot/cmd/root-split
$> root-split -h
Usage: root-split [options] file.root

ex:
 $> root-split -o out.root -n 10 ./testdata/chain.flat.1.root

options:
  -n int
    	number of events to split into (default 100)
  -o string
    	path to output ROOT files (default "out.root")
  -t string
    	input tree name to split (default "tree")
  -v	enable verbose mode

Example:

$> root-ls -t ../../testdata/simple.root
=== [../../testdata/simple.root] ===
version: 60600
  TTree   tree      fake data (entries=4)
    one   "one/I"   TBranch
    two   "two/F"   TBranch
    three "three/C" TBranch

$> root-dump ../../testdata/simple.root
>>> file[../../testdata/simple.root]
key[000]: tree;1 "fake data" (TTree)
[000][one]: 1
[000][two]: 1.1
[000][three]: uno
[001][one]: 2
[001][two]: 2.2
[001][three]: dos
[002][one]: 3
[002][two]: 3.3
[002][three]: tres
[003][one]: 4
[003][two]: 4.4
[003][three]: quatro

$> root-split -n 2 -v ../../testdata/simple.root
root-split: splitting [0, 2) into "out-0.root"...
root-split: splitting [0, 2) into "out-0.root"... [ok]
root-split: splitting [2, 4) into "out-1.root"...
root-split: splitting [2, 4) into "out-1.root"... [ok]

$> root-dump out*.root
>>> file[out-0.root]
key[000]: tree;1 "fake data" (TTree)
[000][one]: 1
[000][two]: 1.1
[000][three]: uno
[001][one]: 2
[001][two]: 2.2
[001][three]: dos
>>> file[out-1.root]
key[000]: tree;1 "fake data" (TTree)
[000][one]: 3
[000][two]: 3.3
[000][three]: tres
[001][one]: 4
[001][two]: 4.4
[001][three]: quatro
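
To make the log above easier to read, here is a small sketch that reproduces the half-open entry ranges and the output file names seen in the verbose output. It is not root-split's implementation. It assumes that -n means entries per output file (the 4-entry example is equally consistent with "number of output files") and that a shorter final chunk gets its own file; neither point is confirmed by the thread. The function name `split_plan` is hypothetical.

```python
import os


def split_plan(total_entries, entries_per_file, output="out.root"):
    """Return (begin, end, filename) tuples, with [begin, end) half-open.

    File names follow the pattern seen in the log: "out.root" -> "out-0.root", ...
    Assumption: a final, shorter chunk is written to its own file.
    """
    if entries_per_file <= 0:
        raise ValueError("entries_per_file must be > 0")
    stem, ext = os.path.splitext(output)
    plan = []
    for index, begin in enumerate(range(0, total_entries, entries_per_file)):
        end = min(begin + entries_per_file, total_entries)
        plan.append((begin, end, f"{stem}-{index}{ext}"))
    return plan


if __name__ == "__main__":
    # Mirrors the 4-entry example with -n 2 shown in the log.
    for begin, end, name in split_plan(4, 2):
        print(f'[{begin}, {end}) -> "{name}"')
```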

You can easily download standalone binaries (i.e. you don’t need Go installed on your machine) for selected platform+OS combinations from https://go-hep.org/dist (e.g. choosing the latest version).

raly · May 6, 2020, 2:27pm

Thanks a lot for your help. But could you tell me how I can use Go on CERN lxplus?

sbinet · May 7, 2020, 10:11am

sorry, I didn’t notice your reply…

you don’t need Go installed, just the binary (besides, the Go compiler installed on lxplus is a bit old: version 1.8, about 3 years old).
(so, you could actually cross-compile it on your linux/windows/macos machine and ship it to lxplus)

here is one I’ve just compiled for you:

~binet/public/root-split-linux-amd64.exe

hth,
-s
