Split a large ROOT tree file into smaller ROOT tree files

Dear experts,

I have a ROOT tree file which contains a large number of entries. When I run my analysis code on the file, it takes a long time and then my job is killed by condor.
I would like to split the large ROOT file into smaller ROOT files.

Any suggestions?

Thanks in advance.

Regards
Reham



Maybe you could consider another approach.
Many ROOT methods take two parameters, “Long64_t nentries = kMaxEntries” and “Long64_t firstentry = 0” (see e.g. TTree::Draw).
So, maybe you could add such options to your “analysis code” and then you could easily create many jobs with limited “nentries”, each starting at a different “firstentry”.
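
To make the bookkeeping for that approach concrete, here is a minimal sketch of how the (firstentry, nentries) pairs for the individual jobs could be computed. It is plain Python with no ROOT dependency. The function name `job_ranges`, the command name `my_analysis` and its `--firstentry`/`--nentries` flags are hypothetical placeholders, not part of ROOT or of any existing tool.

```python
def job_ranges(total_entries, entries_per_job):
    """Return (firstentry, nentries) pairs that together cover [0, total_entries)."""
    if entries_per_job <= 0:
        raise ValueError("entries_per_job must be positive")
    ranges = []
    first = 0
    while first < total_entries:
        # The last job may get fewer entries than the others.
        n = min(entries_per_job, total_entries - first)
        ranges.append((first, n))
        first += n
    return ranges


if __name__ == "__main__":
    # Hypothetical command line: "my_analysis" and its flags are placeholders
    # for whatever options you add to your own analysis code.
    for first, n in job_ranges(1_000_000, 300_000):
        print(f"my_analysis --firstentry {first} --nentries {n} input.root")
```

Each printed line would then become one condor job reading the same input file, so no copy of the data is needed.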

BTW. I’m not sure I know any easy way to “split” trees but maybe @sbinet has something.

nope.

well, groot certainly has all the building blocks to implement a root-split command; this just hasn’t been packaged up yet. So it shouldn’t be too hard to whip something up.

done: https://godoc.org/go-hep.org/x/hep/groot/cmd/root-split

$> go get go-hep.org/x/hep/groot/cmd/root-split
$> root-split -h
Usage: root-split [options] file.root

ex:
 $> root-split -o out.root -n 10 ./testdata/chain.flat.1.root

options:
  -n int
    	number of events to split into (default 100)
  -o string
    	path to output ROOT files (default "out.root")
  -t string
    	input tree name to split (default "tree")
  -v	enable verbose mode

Example:

$> root-ls -t ../../testdata/simple.root
=== [../../testdata/simple.root] ===
version: 60600
  TTree   tree      fake data (entries=4)
    one   "one/I"   TBranch
    two   "two/F"   TBranch
    three "three/C" TBranch

$> root-dump ../../testdata/simple.root
>>> file[../../testdata/simple.root]
key[000]: tree;1 "fake data" (TTree)
[000][one]: 1
[000][two]: 1.1
[000][three]: uno
[001][one]: 2
[001][two]: 2.2
[001][three]: dos
[002][one]: 3
[002][two]: 3.3
[002][three]: tres
[003][one]: 4
[003][two]: 4.4
[003][three]: quatro

$> root-split -n 2 -v ../../testdata/simple.root
root-split: splitting [0, 2) into "out-0.root"...
root-split: splitting [0, 2) into "out-0.root"... [ok]
root-split: splitting [2, 4) into "out-1.root"...
root-split: splitting [2, 4) into "out-1.root"... [ok]

$> root-dump out*.root
>>> file[out-0.root]
key[000]: tree;1 "fake data" (TTree)
[000][one]: 1
[000][two]: 1.1
[000][three]: uno
[001][one]: 2
[001][two]: 2.2
[001][three]: dos
>>> file[out-1.root]
key[000]: tree;1 "fake data" (TTree)
[000][one]: 3
[000][two]: 3.3
[000][three]: tres
[001][one]: 4
[001][two]: 4.4
[001][three]: quatro
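
To predict which output files a given run will produce, the mapping seen in the verbose log above can be sketched in a few lines of Python. This is only a reading of the example output, not the actual root-split implementation. It assumes that -n is the number of entries per output file, that the index is inserted before the extension of the -o path, and that a final shorter chunk is written when the entry count is not a multiple of -n (the example does not show that case). The function name `split_plan` is made up for illustration.

```python
import os


def split_plan(total_entries, entries_per_file, output="out.root"):
    """Return (filename, start, stop) tuples with half-open ranges [start, stop)."""
    if entries_per_file <= 0:
        raise ValueError("entries_per_file must be positive")
    stem, ext = os.path.splitext(output)
    plan = []
    for index, start in enumerate(range(0, total_entries, entries_per_file)):
        # Assumption: the last file simply holds the remaining entries.
        stop = min(start + entries_per_file, total_entries)
        plan.append((f"{stem}-{index}{ext}", start, stop))
    return plan


if __name__ == "__main__":
    for name, start, stop in split_plan(4, 2):
        print(f'[{start}, {stop}) -> "{name}"')
```

With 4 entries and 2 entries per file this gives [0, 2) for out-0.root and [2, 4) for out-1.root, matching the log above.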

You can easily download standalone binaries (i.e. you don’t need Go installed on your machine) for selected platform+OS combinations from https://go-hep.org/dist (e.g. choosing the latest version).

Thanks a lot for your help. Could you tell me how I can use Go on CERN lxplus?

sorry, I didn’t notice your reply…

you don’t need Go installed on lxplus, just the binary.
(besides, the Go compiler installed on lxplus is a bit old: version 1.8, about 3 years old.)
so you could actually cross-compile it on your own linux/windows/macos machine and ship it to lxplus.

here is one I’ve just compiled for you:

  • ~binet/public/root-split-linux-amd64.exe

hth,
-s
