Hi,
I want to compare the content of two ROOT files and found this old post from 2017. Has there been any progress on this? Don’t want to go and reinvent the wheel…
Cheers,
Ben
Hello,
just wanted to say that I found a very efficient way to solve this using RDataFrame and numpy. It requires ROOT >= 6.18.00
from ROOT import RDataFrame
import numpy as np
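# set up the RDataFrames (ttree1/ttree2: tree names, file1/file2: file paths)
rdf1 = RDataFrame(ttree1, file1)
rdf2 = RDataFrame(ttree2, file2)
# convert the RDataFrames to dicts mapping column name -> numpy array
numpyArray1 = rdf1.AsNumpy()
numpyArray2 = rdf2.AsNumpy()
# compare column by column: AsNumpy returns a dict, which np.allclose cannot take directly
all(np.allclose(numpyArray1[col], numpyArray2[col]) for col in numpyArray1)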
The only problematic part is if you have ROOT objects in the branches, which are not or cannot be converted to Python types properly. I am working on solutions here, though.
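One possible way to work around the non-convertible branches is to compare column by column and skip columns that arrive as generic Python objects. The sketch below is only an illustration, not the script used in this thread: `columns_match` is a hypothetical helper, and it assumes its inputs are dicts of column name -> numpy array (the shape of what `AsNumpy()` returns) and that problematic branches show up with dtype `object`.

```python
import numpy as np

def columns_match(cols1, cols2, rtol=1e-5, atol=1e-8):
    """Compare two dicts of column name -> numpy array.

    Hypothetical helper. Returns (ok, skipped), where skipped lists the
    columns with dtype object that were not compared.
    """
    # Both inputs must provide the same set of columns.
    if set(cols1) != set(cols2):
        return False, []
    skipped = []
    for name in sorted(cols1):
        a, b = np.asarray(cols1[name]), np.asarray(cols2[name])
        if a.dtype == object or b.dtype == object:
            # Assumed to be ROOT objects that did not convert cleanly.
            skipped.append(name)
            continue
        if a.shape != b.shape:
            return False, skipped
        numeric = (np.issubdtype(a.dtype, np.number)
                   and np.issubdtype(b.dtype, np.number))
        if numeric:
            same = np.allclose(a, b, rtol=rtol, atol=atol)
        else:
            # Booleans, strings etc. are compared exactly.
            same = np.array_equal(a, b)
        if not same:
            return False, skipped
    return True, skipped
```

With this, something like `ok, skipped = columns_match(rdf1.AsNumpy(), rdf2.AsNumpy())` would report which branches still need a manual check.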
Using root-diff (from GoHEP), referenced in the post you linked above, should work as well
(user class branches have been supported since then).
(I’d also be interested in knowing which types aren’t handled properly by Python.)
Hi @sbinet,
thank you for your reply. For the type that I had most problems with, please see
Another format that was not handled properly was the C++ type long, which needed the same conversion via a new branch as mentioned in the quoted post: rdf.Define("x_new", "int(x)").
Cheers,
Ben
ok. then groot/root-diff wouldn’t have any issue with these types.
We have merged a development related to this: https://github.com/root-project/root/pull/4253 I.e. TFiles can now be made binary identical, starting with ROOT v6.20.
Could you share a use case where you need to compare two files (not two branches - that I could imagine) for identity +/- some precision?
Dear Axel,
the use case is that we have n-tuple production software. Say we update this software, and this obviously affects the n-tuple. We then want to know exactly what changes were made to the n-tuple, to make sure that we did not introduce a bug somewhere and that we understand our changes properly. Also, we often change things that should not change the n-tuple output at all (optimisations etc.), and we want to verify that it indeed stays the same. Therefore, we want to compare the n-tuple created with our latest n-tuple software to a reference n-tuple.
I hope this is useful. I can provide you with some example files, if needed.
Cheers,
Ben
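For the reference comparison described above, a plain yes/no answer is often less useful than a summary of what changed. A minimal sketch of such a report follows. It is not the attached script: `diff_report` is a hypothetical helper, and it assumes both n-tuples are already available as dicts of branch name -> numeric numpy array (for example from `AsNumpy()`).

```python
import numpy as np

def diff_report(reference, candidate, rtol=1e-5, atol=1e-8):
    """Summarise the differences between a reference and a new n-tuple.

    Hypothetical helper: both arguments are dicts of branch name -> numeric
    numpy array. Non-numeric branches are not handled here.
    """
    report = {
        # Branches present only in the reference or only in the new n-tuple.
        "removed": sorted(set(reference) - set(candidate)),
        "added": sorted(set(candidate) - set(reference)),
        # Branch name -> short description of the change.
        "changed": {},
    }
    for name in sorted(set(reference) & set(candidate)):
        ref = np.asarray(reference[name], dtype=float)
        new = np.asarray(candidate[name], dtype=float)
        if ref.shape != new.shape:
            report["changed"][name] = f"shape: {ref.shape} -> {new.shape}"
        elif not np.allclose(ref, new, rtol=rtol, atol=atol):
            max_diff = np.max(np.abs(ref - new))
            report["changed"][name] = f"max abs diff: {max_diff:g}"
    return report
```

An empty report (no removed, added or changed branches) then corresponds to the "optimisation that should not change anything" case, while a non-empty one points directly at the affected branches.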
Thanks! Attached the script I developed and used to do the comparison back then (may not be bug free).
compareTest.py (5.8 KB)