Import a .hbook file in Python

pyAddicted · January 24, 2012, 3:06pm

Hi everybody!

I have a small problem. I’m analysing a group of .hbook files collected by someone else a few years ago. It was suggested that I use PAW, but I feel more comfortable with Python. The problem is that the .hbook files are binary, and I have already spent two days trying to work out how they are structured without getting anywhere.
Yesterday I installed PyROOT and converted the files to .root: I can open them, see the internal structure and navigate inside, but I can’t access the data.
I need to access the whole list of numbers in the file; it doesn’t matter how, I just need to work on them directly in Python. Does anyone know how to get these numbers with PyROOT? Alternatively, it would also be fine to access the data directly from the .hbook files or to convert them to a more Python-friendly format (from inside or outside the Python script).

Thanks in advance for the help.

couet · January 24, 2012, 3:34pm

convert the .hbook files to .root files using h2root
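
For anyone who wants to script this step, here is a minimal sketch of calling h2root from Python. It assumes the `h2root` executable that ships with ROOT is on the PATH and that it accepts the input file followed by an output file name. The helper names `root_name_for` and `convert_hbook` are made up for this example.

```python
import os
import subprocess


def root_name_for(hbook_path):
    """Return the .root file name matching an .hbook file name."""
    base, _ext = os.path.splitext(hbook_path)
    return base + ".root"


def convert_hbook(hbook_path):
    """Run h2root on one file and return the name of the .root file.

    Assumption: h2root is installed and reachable through the PATH.
    """
    root_path = root_name_for(hbook_path)
    # check_call raises CalledProcessError if h2root exits with an error
    subprocess.check_call(["h2root", hbook_path, root_path])
    return root_path
```

Using `os.path.splitext` instead of slicing the string keeps the name correct whatever the length of the path.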

pyAddicted · January 24, 2012, 3:57pm

Yes, I already did the conversion, but now I don’t know how to get the data stored in the file.

What I managed to do is:

[code]import ROOT as r

f1 = r.TFile('filename.root', 'read')
f2 = f1.Get('h1')
f3 = f2.GetBranch('N1')[/code]
I can see how many elements there are in f3 with f3.GetEntries(), but when I try to get a single entry with f3.GetEntry it always returns the same number, depending on the branch (every branch returns 4 except one that returns a number bigger than 30000, always the same).

wlav · January 24, 2012, 5:48pm

Hi,

GetEntry() called on a TTree (TNtuple) just returns the total number of bytes read. If the number is 4, then presumably you are reading back ints or floats. To access the data, access them as if the branch/leaf names signify data members on the tree (ntuple).

Cheers,
Wim
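
A minimal sketch of that access pattern, for reference: load one entry with GetEntry(), ignore the byte count except as a success check, then read the branch as an attribute of the tree. The function name `read_entry` is made up for this example, and it assumes PyROOT exposes an array branch as something `list()` can copy, which may behave differently for variable-length arrays depending on the ROOT version.

```python
def read_entry(tree, index, branch_name):
    """Load entry `index` and return the values of `branch_name` as a list."""
    nbytes = tree.GetEntry(index)  # return value is a byte count, not the data
    if nbytes <= 0:
        raise IndexError("could not read entry %d" % index)
    value = getattr(tree, branch_name)  # same as tree.N1, tree.Ivfas_data1, ...
    try:
        return list(value)  # array branch: copy the values out of the buffer
    except TypeError:
        return [value]      # scalar branch: wrap the single number


# Intended use with PyROOT (not executed here):
#   import ROOT
#   f = ROOT.TFile("filename.root", "read")
#   tree = f.Get("h1")
#   values = read_entry(tree, 0, "Ivfas_data1")
```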

pyAddicted · January 24, 2012, 6:59pm

In reality for every entry I should read a list of 1536 integers, so I was working on the wrong branch.
That apart, I still don’t understand the structure of the data. The branch contains 202 entries, and that’s correct, but there seems to be only one leaf, with 1536 elements. I obtained this information from:

[code]import ROOT as r

f1 = r.TFile('filename.root', 'read')
f2 = f1.Get('h1')
f3 = f2.GetBranch('Ivfas_data1')
print(f3.GetNleaves())  # returns 1
print(f3.GetEntries())  # returns 202
f4 = f3.GetLeaf('Ivfas_data1')
print(f4.GetLen())  # returns 1536[/code]

Thanks a lot for the help, it’s already better. At least now I know where the data are.

wlav · January 24, 2012, 7:28pm

Hi,

you can use the TTree.Print() member function to see the structure and the TTree.Scan() function to see the full contents. Usually those two together are enough to figure out the rest.

Cheers,
Wim

pyAddicted · January 24, 2012, 9:08pm

Perfect!
I managed to print the desired elements on a file:

[code]import ROOT as r

f1 = r.TFile('filename.root', 'read')
f2 = f1.Get('h1').GetPlayer()
f2.SetScanRedirect(True)
f2.SetScanFileName('example.dat')
f2.Scan('Ivfas_data1', '', '', 1000000000, 0)[/code]

Is there also a way to import the numbers directly into a list in Python? Otherwise I’ll just do this operation in the background and delete the file after use.

Thanks again, you saved me weeks of work!

wlav · January 24, 2012, 9:21pm

Hi,

the answer to the last question depends on what the actual structure is. In any case, iterating over the tree and appending to a list is always possible.

If you can post the file, I can probably write an example script.

Cheers,
Wim
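
A minimal sketch of the "iterate and append" approach, assuming the tree can be looped over directly in PyROOT and that each branch is readable as an attribute of the current entry. The function name `branch_to_lists` is made up for this example, and the branch is assumed to hold an array (as Ivfas_data1 does in this thread).

```python
def branch_to_lists(tree, branch_name):
    """Return one Python list per entry holding the values of an array branch."""
    rows = []
    for entry in tree:  # each iteration loads the next entry
        values = getattr(entry, branch_name)
        # Copy the values: the underlying buffer is reused for the next entry.
        rows.append(list(values))
    return rows


# Intended use with PyROOT (not executed here):
#   import ROOT
#   f = ROOT.TFile("filename.root", "read")
#   data = branch_to_lists(f.Get("h1"), "Ivfas_data1")
```

This skips the temporary text file entirely, at the cost of a Python-level loop over the entries.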

pyAddicted · January 24, 2012, 9:49pm

Thanks, but I managed to do all the work in an external module that needs less than 3 seconds for 1000 measurements. That’s more than enough for me.

If someone has the same problem in the future, here is the code to solve it:

[code]# -*- coding: utf-8 -*-
import os, subprocess
import ROOT as r

def hbook(hbook_file, hist_name='h1', branch_name='Ivfas_data1'):
    root_file = hbook_file[:-5] + 'root'  # 'name.hbook' -> 'name.root'
    subprocess.call(['h2root', hbook_file, root_file])  # Conversion of hbook file via h2root
    f1 = r.TFile(root_file, 'read')
    f2 = f1.Get(hist_name).GetPlayer()
    f2.SetScanRedirect(True)
    f2.SetScanFileName('data.temp')
    f2.Scan(branch_name, '', '', 1000000000, 0)  # Put the data in a temp file
    f1.Close()
    os.remove(root_file)  # Useless now

    f1 = open('data.temp')
    result = []
    for line in f1:
        try:
            result.append(float(line.split('*')[-2]))
        except (ValueError, IndexError):  # header, separator and summary lines
            pass
    f1.close()
    os.remove('data.temp')
    return result[/code]

Edit: better version:

[code]# -*- coding: utf-8 -*-
import os, subprocess
import ROOT as r

def hbook(hbook_file, hist_name='h1', branch_name='Ivfas_data1'):
    root_file = hbook_file[:-5] + 'root'  # 'name.hbook' -> 'name.root'
    subprocess.call(['h2root', hbook_file, root_file])  # Conversion of hbook file via h2root
    f1 = r.TFile(root_file, 'read')
    f2 = f1.Get(hist_name).GetPlayer()
    f2_len = f2.GetEntries('')
    f2.SetScanRedirect(True)
    f2.SetScanFileName('data.temp')
    f2.Scan(branch_name, '', '', 1000000000, 0)  # Put the data in a temp file
    f1.Close()
    os.remove(root_file)  # Useless now

    f1 = open('data.temp')
    result = []
    counter = 1
    result.append([])
    for line in f1:
        try:
            result[-1].append(float(line.split('*')[-2]))
            if counter < f2_len:
                counter += 1
            else:  # in case there are more rows than entries -> every entry is a list
                counter = 1
                result.append([])
        except (ValueError, IndexError):  # header, separator and summary lines
            pass
    result = result[:-1]
    if len(result) == 1:
        result = result[0]
    else:
        # Transpose with zip; replaces scipy's matrix, which recent SciPy no longer exports
        result = [list(column) for column in zip(*result)]

    f1.close()
    os.remove('data.temp')
    return result[/code]
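
For readers who keep the Scan-to-file route, here is a minimal sketch of a parser that groups the values by the entry number printed in the first column instead of counting lines. It assumes the Scan() output has a "Row" column first, an "Instance" column for array branches, and the requested value in the last column; that is worth checking against your own temp file. The function name `parse_scan_lines` is made up for this example.

```python
def parse_scan_lines(lines):
    """Group the last column of TTree.Scan() text output by the Row column."""
    grouped = {}
    for line in lines:
        # "*  0 *  1 *  5 *" -> ["0", "1", "5"]
        cells = [cell.strip() for cell in line.strip().strip("*").split("*")]
        try:
            row = int(cells[0])
            value = float(cells[-1])
        except (ValueError, IndexError):
            continue  # header, separator or summary line
        grouped.setdefault(row, []).append(value)
    # One list per entry, in entry order
    return [grouped[row] for row in sorted(grouped)]


sample = [
    "***********************************",
    "*    Row   * Instance * Ivfas_dat *",
    "***********************************",
    "*        0 *        0 *         3 *",
    "*        0 *        1 *         5 *",
    "*        1 *        0 *         7 *",
    "*        1 *        1 *         9 *",
    "***********************************",
    "==> 4 selected entries",
]
print(parse_scan_lines(sample))  # [[3.0, 5.0], [7.0, 9.0]]
```

With a real file, pass the open file object: `parse_scan_lines(open('data.temp'))`.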