Convert a ROOT file to something readable by Python

Hello all,
I’ve been given a project in which I am supposed to implement several machine learning algorithms on data obtained by another group. However, the data is in ROOT format, and I am trying to convert it into something that machine learning tools can read more easily. I have tried to print the various graphs to txt files using a command I found on these forums:
hp0->Print("all"); dump.txt
where hp0 is the name of the graph and dump.txt is an empty txt file in the same directory, but I get the error “undeclared identifier dump”.
I have also tried the dump command in ROOT, but it does not give useful information. It outputs:

root [2] ==> Dumping object at: 0x00005578ccd0a590, name=TFrame, class=TFrame

fBorderSize 1 window box bordersize in pixels
fBorderMode 0 Bordermode (-1=down, 0 = no border, 1=up)
*fTip ->0 ! tool tip associated with box
fX1 0 X of 1st point
fY1 -0.228786 Y of 1st point
fX2 20 X of 2nd point
fY2 0.265047 Y of 2nd point
fResizing false ! True if box is being resized
fUniqueID 0 object unique identifier
fBits 0x03000008 bit field status word
fLineColor 1 Line color
fLineStyle 1 Line style
fLineWidth 1 Line width
fFillColor 0 Fill area color
fFillStyle 1001 Fill area style

So, I would like to know how I should convert the file.
Thanks in advance!

ROOT Version: 6
Platform: Ubuntu 20.04
Compiler: NA


I think @etejedor and @moneta can give some advice for this

depending on the type of data you want to convert, there may be a couple of solutions available from Go-HEP:

  • a TTree (if its structure is compatible) can be converted to a numpy array file via root2npy
  • a TTree can also be converted to arrow via root2arrow
  • TH{1,2}X and TGraph{,{,Asymm}Errors} can be converted to text (using the YODA ASCII format) via root2yoda

binaries for Go-HEP are available here:

alternatively, there is also the uproot4 pure-Python package that can read/write ROOT files.
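
As a rough illustration of the uproot route, the sketch below reads a tree or ntuple into NumPy arrays, either all at once or chunk by chunk. It assumes the package is installed under the name uproot (version 4 or later) and that it can read the ntuple in question; tree_spec, read_all and iter_chunks are hypothetical helper names, not part of uproot.

```python
def tree_spec(path, tree_name):
    """Build the "file.root:treename" string that uproot.iterate accepts."""
    return f"{path}:{tree_name}"


def read_all(path, tree_name):
    """Read every branch into a dict of NumPy arrays (needs enough RAM)."""
    import uproot  # third-party; imported lazily so the helpers load without it

    with uproot.open(path) as f:
        return f[tree_name].arrays(library="np")


def iter_chunks(path, tree_name, step_size="100 MB"):
    """Yield dicts of NumPy arrays chunk by chunk, for files too big for RAM."""
    import uproot

    yield from uproot.iterate(tree_spec(path, tree_name),
                              step_size=step_size, library="np")
```

For a file of several gigabytes, looping over iter_chunks(...) and processing each chunk in turn is likely the safer choice, since read_all loads everything into memory.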

hth,
-s

Hi,
with native ROOT you can use RDataFrame’s AsNumpy method, here is a tutorial.

Cheers,
Enrico

Hello,

With ROOT, you can dump the contents of a ROOT file into NumPy arrays with the AsNumpy method of RDataFrame. Here’s a tutorial:

https://root.cern/doc/master/df026__AsNumpyArrays_8py_source.html

Basically, you would create an RDataFrame from a tree in one or more files, then use AsNumpy to store the columns (branches) you want as NumPy arrays. AsNumpy returns a dictionary with one entry per column: the key is the column name and the value is the NumPy array for that column.
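
To make the dictionary concrete, here is a minimal sketch that stacks the per-column arrays into a single 2D matrix, the layout most machine learning libraries expect. The helper names load_columns and to_feature_matrix are hypothetical, and the sketch assumes every column holds one scalar per entry (as in a TNtuple).

```python
import numpy as np


def load_columns(tree_name, file_name, columns=None):
    """Return {column name: NumPy array} via RDataFrame.AsNumpy."""
    import ROOT  # PyROOT; imported lazily so the helper below works without it

    df = ROOT.RDataFrame(tree_name, file_name)
    if columns is None:
        return df.AsNumpy()  # all columns
    return df.AsNumpy(columns=columns)  # only the requested columns


def to_feature_matrix(columns):
    """Stack 1D column arrays into an (n_entries, n_columns) matrix."""
    names = sorted(columns)  # fixed column order, independent of dict order
    matrix = np.column_stack([columns[name] for name in names])
    return names, matrix
```

Calling to_feature_matrix(load_columns(tree_name, file_name)) would then give the column names together with a matrix that has one row per entry.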

Thanks for the help.
I’ve been trying to use the root2npy command, but it does not seem to produce the file. Here is the output:

./root2npy -f /home/octopus/Documents/Work/Thesis/Data/ColliderSimulation/LHC1/test023.root -t GammaC -o /home/octopus/Documents/Work/Thesis/Data/ColliderSimulation/LHC1/output.npz
root2npy: scanning leaves...
root2npy: scanning leaves... [done]
panic: rtree: unknown Tree implementation *rtree.tntuple

goroutine 1 [running]:
go-hep.org/x/hep/groot/rtree.newReader({0x7f0798bf29f8, 0xc0001c71e0}, {0xc0001d1680, 0x5, 0x8}, 0x51109f, 0xc000140000, 0x19)
	/home/octopus/go/pkg/mod/go-hep.org/x/hep@v0.29.2/groot/rtree/reader.go:261 +0x1a8
go-hep.org/x/hep/groot/rtree.NewReader({0x7f0798bf29f8, 0xc0001c71e0}, {0xc0001d1680, 0x5, 0x8}, {0x0, 0x0, 0x0})
	/home/octopus/go/pkg/mod/go-hep.org/x/hep@v0.29.2/groot/rtree/reader.go:66 +0x199
main.process({0x7ffd530097e7, 0x4b}, {0x7ffd5300978c, 0x4d}, {0x7ffd530097dd, 0x6})
	/home/octopus/Documents/Work/Thesis/Data/ColliderSimulation/LHC1/test.go:199 +0x43a
main.main()
	/home/octopus/Documents/Work/Thesis/Data/ColliderSimulation/LHC1/test.go:160 +0x185

The contents of the ROOT file are:

TFile**		test023.root	
 TFile*		test023.root	
  KEY: TNtuple	v2;471	v2 [current cycle]
  KEY: TNtuple	v2;470	v2 [backup cycle]
  KEY: TNtuple	GammaA;226	GammaA [current cycle]
  KEY: TNtuple	GammaA;225	GammaA [backup cycle]
  KEY: TNtuple	GammaC;1	GammaC
  KEY: TProfile	hp0;1	hp0
  KEY: TProfile	hp1;1	hp1
  KEY: TProfile	hp2;1	hp2
  KEY: TProfile	hp3;1	hp3
  KEY: TProfile	hp4;1	hp4
  KEY: TProfile	hp5;1	hp5
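
The listing above also contains the TProfile objects (hp0 to hp5) that the original question tried to print to a text file. One possible way to do that from Python is sketched below: it loops over the regular bins and writes the bin center, bin content (the profiled mean) and bin error to a CSV file. The helper names and the CSV column headers are hypothetical; the bin accessors are the standard TH1/TProfile methods.

```python
import csv


def profile_to_rows(prof):
    """Return (bin center, bin content, bin error) for each regular bin."""
    rows = []
    # Bin 0 is the underflow bin and bin N+1 the overflow bin, so skip both.
    for i in range(1, prof.GetNbinsX() + 1):
        rows.append((prof.GetBinCenter(i),
                     prof.GetBinContent(i),
                     prof.GetBinError(i)))
    return rows


def write_rows(rows, csv_path):
    """Write the rows to a CSV file with a header line."""
    with open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["bin_center", "mean", "error"])
        writer.writerows(rows)


def dump_profile(root_path, name, csv_path):
    """Open a ROOT file, fetch one profile by name and dump it to CSV."""
    import ROOT  # PyROOT; imported lazily so the helpers above work without it

    root_file = ROOT.TFile.Open(root_path)
    write_rows(profile_to_rows(root_file.Get(name)), csv_path)
    root_file.Close()
```

For example, dump_profile("test023.root", "hp0", "hp0.csv") would write one line per bin of hp0.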

thanks for the report.

do you have a small ROOT file that I could play with?

No, but thank you for the help.
The files I have are between 10 and 25 GB, so I suppose uploading one would not work. I am somewhat new to ROOT; is there a way I could move a TNtuple to its own file and try to upload that?
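
One way this could be done, sketched under assumptions, is to copy the first few thousand entries of the ntuple into a new file with RDataFrame's Range and Snapshot. The function name extract_sample and the default entry count are hypothetical. Note that Snapshot writes a TTree, not a TNtuple, so the copy may not reproduce a problem that is specific to the TNtuple class.

```python
def extract_sample(src_file, tree_name, dst_file, n_entries=10000):
    """Copy the first n_entries of one tree/ntuple into a new, smaller file."""
    if n_entries <= 0:
        raise ValueError("n_entries must be positive")

    import ROOT  # PyROOT; imported lazily, after the argument check

    df = ROOT.RDataFrame(tree_name, src_file)
    # Range only works when implicit multi-threading is not enabled.
    df.Range(n_entries).Snapshot(tree_name, dst_file)
```

For the file in this thread the call might look like extract_sample("test023.root", "GammaC", "small.root").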

never mind, I have a fix :)

(I’ll wait for the tests to complete and post a binary for you to test somewhere)

here it is:

I’ll cut a new Go-HEP release tomorrow morning (CERN time).

hth,
-s

@sbinet TNtupleD, too?

not in that PR (yet).

I’ll add support for TNtupleD shortly.

@Wile_E_Coyote done. (I’ve also updated the previous binary with TNtupleD support. hth)

Thanks for the file!
However, sadly, now I’m running into a different error.

./root2npy -f /home/octopus/Documents/Work/Thesis/Data/ColliderSimulation/LHC1/test023.root -t GammaC -o /home/octopus/Documents/Work/Thesis/Data/ColliderSimulation/LHC1/output.npz
root2npy: scanning leaves...
root2npy: scanning leaves... [done]
panic: reflect: slice index out of range

goroutine 1 [running]:
reflect.Value.Index({0xaf51c0?, 0xc0000ac7f8?, 0x68358b?}, 0x444e73?)
	/home/binet/sdk/go/src/reflect/value.go:1366 +0x16d
github.com/sbinet/npyio/npy.shapeFrom({0xaf51c0?, 0xc0000ac7f8?, 0x42ebd6?})
	/home/binet/work/gonum/pkg/mod/github.com/sbinet/npyio@v0.5.2/npy/writer.go:520 +0x185
github.com/sbinet/npyio/npy.Write({0xdfef80, 0xc0001dd710}, {0xaf51c0?, 0xc0000ac7f8?})
	/home/binet/work/gonum/pkg/mod/github.com/sbinet/npyio@v0.5.2/npy/writer.go:40 +0x1a7
github.com/sbinet/npyio.Write(...)
	/home/binet/work/gonum/pkg/mod/github.com/sbinet/npyio@v0.5.2/npyio.go:125
main.process({0x7ffdca9d87df, 0x4b}, {0x7ffdca9d8784, 0x4d}, {0x7ffdca9d87d5, 0x6})
	/home/binet/work/gonum/src/go-hep.org/x/hep/cmd/root2npy/main.go:225 +0xb69
main.main()
	/home/binet/work/gonum/src/go-hep.org/x/hep/cmd/root2npy/main.go:160 +0x185

Sorry for all the trouble

@Tmikhail no trouble at all, it’s useful to get Go-HEP more robust and battlefield-tested.

here is another binary:

that should give me more information about what is happening.

Here is the output of the new binary:

./root2npy -f /home/octopus/Documents/Work/Thesis/Data/ColliderSimulation/LHC1/test023.root -t GammaC -o /home/octopus/Documents/Work/Thesis/Data/ColliderSimulation/LHC1/output.npz
root2npy: scanning leaves...
root2npy: scanning leaves... [done]
root2npy: could not get shape from []float32: []
err: reflect: slice index out of range
panic: reflect: slice index out of range [recovered]
	panic: reflect: slice index out of range

goroutine 1 [running]:
github.com/sbinet/npyio/npy.shapeFrom.func1()
	/home/binet/work/gonum/src/github.com/sbinet/npyio/npy/writer.go:528 +0x125
panic({0xb00d20, 0xdff530})
	/home/binet/sdk/go/src/runtime/panic.go:838 +0x207
reflect.Value.Index({0xaf5520?, 0xc00000c828?, 0x2?}, 0xc000026e98?)
	/home/binet/sdk/go/src/reflect/value.go:1366 +0x16d
github.com/sbinet/npyio/npy.shapeFrom({0xaf5520?, 0xc00000c828?, 0x42eed6?})
	/home/binet/work/gonum/src/github.com/sbinet/npyio/npy/writer.go:531 +0x205
github.com/sbinet/npyio/npy.Write({0xdffee0, 0xc0001c9710}, {0xaf5520?, 0xc00000c828?})
	/home/binet/work/gonum/src/github.com/sbinet/npyio/npy/writer.go:41 +0x1a7
github.com/sbinet/npyio.Write(...)
	/home/binet/work/gonum/src/github.com/sbinet/npyio/npyio.go:125
main.process({0x7ffd31cb37df, 0x4b}, {0x7ffd31cb3784, 0x4d}, {0x7ffd31cb37d5, 0x6})
	/home/binet/work/gonum/src/go-hep.org/x/hep/cmd/root2npy/main.go:225 +0xb69
main.main()
	/home/binet/work/gonum/src/go-hep.org/x/hep/cmd/root2npy/main.go:160 +0x185

@Tmikhail ok.
that’s what I thought.

let’s move this conversation over there:

(if you have a github account, that is. otherwise, let’s continue privately with direct messages via root-forum)

Btw this should work in ROOT:

import ROOT
# Read the GammaC ntuple; AsNumpy returns a dict mapping column names to NumPy arrays
dictionary_of_numpy_arrays = ROOT.RDataFrame("GammaC", "test023.root").AsNumpy()

Cheers,
Enrico

For archives’ sake, this was fixed w/ sbinet/npyio@v0.6.0 and collected into Go-HEP@v0.30.1.

one can download the binaries from:

(or directly using the Go SDK)
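
Once root2npy has produced an .npz file, it can be loaded on the Python side with NumPy. The sketch below assumes the archive holds one array per branch, keyed by branch name; that layout is an assumption here, so check archive.files on a real output. To stay runnable without a real output file, it builds a small stand-in archive with made-up column names x and y.

```python
import os
import tempfile

import numpy as np


def load_npz(path):
    """Load every array stored in an .npz archive into a plain dict."""
    with np.load(path) as archive:
        return {name: archive[name] for name in archive.files}


# Stand-in archive so the example runs without a real root2npy output file;
# the column names "x" and "y" are made up for illustration.
demo_path = os.path.join(tempfile.mkdtemp(), "output.npz")
np.savez(demo_path,
         x=np.array([1.0, 2.0], dtype=np.float32),
         y=np.array([3.0, 4.0], dtype=np.float32))

columns = load_npz(demo_path)
print(sorted(columns))  # names of the arrays found in the archive
```

With a real file, replace demo_path with the path passed to the -o option of root2npy.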