HDF5 to root

A few months back there was a request in this forum for a tool that would convert HDF5 files to root format. I had to solve the problem myself, so here is my solution for the following case: converting a rank 1 HDF5 dataset of a compound HDF5 data type made of primitive types to a root tree.

The code follows. It handles only an HDF5 dataset directly under “/”; the limitation is easy to remove, but I haven’t done it.

//  Giuseppe Vacanti (cosine science & computing bv)
//  April 23, 2004
// 
//   $Id: .emacs,v 1.9 2004/04/17 18:42:11 gvacanti Exp $

#include<string>
#include<cstdlib>
#include<H5Cpp.h>
#include<iostream>
#include<cassert>
#include<vector>

#include "root/TROOT.h"
#include "root/TFile.h"
#include "root/TNtuple.h"

namespace {

  char map_h5type_to_root(H5::DataType type) {
    
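    if(type == H5::PredType::NATIVE_SHORT){
      return 'S';  // ROOT: 16-bit signed integer
    }
    if(type == H5::PredType::NATIVE_USHORT){
      return 's';  // ROOT: 16-bit unsigned integer
    }
    if(type == H5::PredType::NATIVE_INT){
      if(type.getSize() == 2)  // getSize() returns bytes, not bits
        return 'S';
      else
        return 'I';  // ROOT: 32-bit signed integer
    }
    if(type == H5::PredType::NATIVE_UINT){
      if(type.getSize() == 2)  // getSize() returns bytes, not bits
        return 's';
      else
        return 'i';  // ROOT: 32-bit unsigned integer
    }
    if(type == H5::PredType::NATIVE_LONG){
      return type.getSize() == 8 ? 'L' : 'I';  // 'L' is ROOT's 64-bit signed
    }
    if(type == H5::PredType::NATIVE_ULONG){
      return type.getSize() == 8 ? 'l' : 'i';  // 'l' is ROOT's 64-bit unsigned
    }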
    if(type == H5::PredType::NATIVE_FLOAT){
      return 'F';
    }
    if(type == H5::PredType::NATIVE_DOUBLE){
      return 'D';
    }
    
    assert(false && "HDF5 member type not supported");
    return '\0';  // reached only when assertions are disabled
  }
};


int main(int argc, char * argv[]) {
  
  using namespace std;
  using namespace H5;

  // h5toroot <hdffile> <hdf5 dataset name> <root file name>
  if(argc != 4){
    cout << "Usage: " << argv[0] << " <hdffile> <hdf5 dataset name> <root file name>\n";
    exit(1);
  }


  const string tablename(argv[2]);
  const string filename(argv[1]);
  const string rootfile(argv[3]);
    

  H5File h5 = H5File(filename, H5F_ACC_RDONLY);
  Group root = h5.openGroup("/");
  DataSet ds = root.openDataSet(tablename);

  DataSpace dsp = ds.getSpace();
  cout << "Found the table " << tablename << "\n";
  cout << "Rank: " << dsp.getSimpleExtentNdims() << "\n";
  if(dsp.getSimpleExtentNdims() != 1){
    cout << "Cannot handle tables with rank != 1.";
    exit(1);
  }

  const hssize_t nrecs = dsp.getSimpleExtentNpoints();
  const CompType type = ds.getCompType();
  const int nm = type.getNmembers();
  const size_t twidth = type.getSize();

  TFile * rfile = new TFile(rootfile.c_str(),
			    "RECREATE",
			    "Dump of HDF5 file");
  TTree * rtree = new TTree("table", "table");
  

  string description;

  vector<size_t> offsets(nm);
  vector<char> rflags(nm);
  for(int k = 0; k < nm; ++k) {
    offsets[k] = type.getMemberOffset(k);
    rflags[k] = map_h5type_to_root(type.getMemberDataType(k));
    description += type.getMemberName(k) + "/" + rflags[k] + ":";      
  }
  description.erase(description.size() - 1);
  
  hsize_t dims[] = { 1 };
  hsize_t count[] = { 1 };
  hssize_t offset[] = { 0 };
  DataSpace mem(1,dims);
  hssize_t start[] = { 0 };
  hssize_t end[] = { 0 };

  char * data = new char[twidth];
  rtree->Branch("data", data, description.c_str());
  
  for(size_t k = 0; k < nrecs; ++k){
    dsp.selectHyperslab(H5S_SELECT_SET, count, offset, 0, dims);
    ds.read(data, type, mem, dsp);
    rtree->Fill();
    offset[0] += count[0];
  }
  rfile->Write();
}
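
The branch description that the loop over the compound members builds is a ROOT leaf list of the form name/code:name/code. A minimal sketch of that one step in isolation, with no HDF5 or ROOT dependency, might look like this; `Member` and `make_leaflist` are hypothetical names introduced only for illustration:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one member of an HDF5 compound type.
struct Member {
  std::string name;  // member name, as getMemberName() would return it
  char code;         // ROOT leaf type code, e.g. 'I', 'F', 'D'
};

// Join the members into "name/code:name/code:...", the leaf-list
// format passed as the third argument of TTree::Branch.
std::string make_leaflist(const std::vector<Member>& members) {
  std::string description;
  for (std::size_t k = 0; k < members.size(); ++k) {
    if (k > 0) description += ':';  // separator only between leaves
    description += members[k].name;
    description += '/';
    description += members[k].code;
  }
  return description;
}
```

Writing the separator before every leaf except the first avoids having to erase a trailing ":" afterwards, so an empty compound type simply gives an empty string.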

I got these messages:

g++ -Wall -O2 -o hdf2root -I/usr/local/include -I/opt/cern/root/include -L/usr/local/lib -L/opt/cern/root/lib hdf2root.cxx
hdf2root.cxx: In function ‘int main(int, char**)’:
hdf2root.cxx:120: error: invalid conversion from ‘hssize_t*’ to ‘const hsize_t*’
hdf2root.cxx:120: error: initializing argument 3 of ‘void H5::DataSpace::selectHyperslab(H5S_seloper_t, const hsize_t*, const hsize_t*, const hsize_t*, const hsize_t*) const’
hdf2root.cxx:113: warning: unused variable ‘start’
hdf2root.cxx:114: warning: unused variable ‘end’

I named it hdf2root.cxx.

I’d appreciate any help possible.

Try taking these lines:

hsize_t dims[] = { 1 };
hsize_t count[] = { 1 };
hssize_t offset[] = { 0 };
DataSpace mem(1,dims);
hssize_t start[] = { 0 };
hssize_t end[] = { 0 };

and changing them like so:

hsize_t dims[] = { 1 };
hsize_t count[] = { 1 };
hsize_t offset[] = { 0 };
DataSpace mem(1,dims);
// hssize_t start[] = { 0 };
// hssize_t end[] = { 0 };

Bear in mind I know nothing of this particular project, and I have not tried compiling the code.

Hope that helped.

Peter

Howdy:

change

hssize_t offset[] = { 0 };

to

hsize_t offset[] = { 0 };

to make the compilation error go away.

Giuseppe
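
The compile error arises because hssize_t is a signed type and hsize_t an unsigned one, and C++ does not implicitly convert a pointer to one into a pointer to the other. A small sketch with stand-in types shows the same situation without needing HDF5; `uhsize`, `shsize`, `first_offset` and `last_selected_offset` are hypothetical names, and the exact underlying integer types are an assumption:

```cpp
typedef unsigned long long uhsize;  // stand-in for HDF5's unsigned hsize_t
typedef long long shsize;           // stand-in for HDF5's signed hssize_t

// Stand-in for a function that, like DataSpace::selectHyperslab,
// takes its start offsets as a pointer to unsigned values.
uhsize first_offset(const uhsize* start) { return start[0]; }

// Walk a rank-1 dataset one record at a time, as the conversion loop does,
// and return the offset selected for the last record (0 if there are none).
uhsize last_selected_offset(uhsize nrecs) {
  uhsize count[] = { 1 };
  uhsize offset[] = { 0 };  // unsigned, so it matches the parameter type
  // shsize bad[] = { 0 };
  // first_offset(bad);     // would not compile: shsize* is not const uhsize*
  uhsize last = 0;
  for (uhsize k = 0; k < nrecs; ++k) {
    last = first_offset(offset);
    offset[0] += count[0];  // advance to the next record
  }
  return last;
}
```

Declaring the offset array with the unsigned type from the start is simpler than casting at the call site, and an offset into a dataset is never negative anyway.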

Hello,
Seven years later, this is still the only solution I have found for converting HDF5 files to ROOT files.
However, we are using an openmpi version of hdf5 which has no C++ support, only C and Fortran.
Does anyone know of a small program like this written for the C hdf5 API?
hdfgroup.org/HDF5/doc/H5.int … Intro-APIs

Thank you very much.

Hi,

ROOT only has a C++ interface. However, you can usually write an extern “C” wrapper function that is compiled as C++ and is still callable from any C code.

Cheers,
Philippe.
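
The extern "C" wrapper idea can be sketched roughly as follows. This is only an illustration: `RecordWriter` and the `writer_*` functions are hypothetical names, and in a real converter the class body would call ROOT instead of filling a vector.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// C++ side: a hypothetical class standing in for code that would use ROOT.
class RecordWriter {
 public:
  explicit RecordWriter(const std::string& name) : name_(name) {}
  void fill(double value) { values_.push_back(value); }
  std::size_t entries() const { return values_.size(); }
  const std::string& name() const { return name_; }
 private:
  std::string name_;
  std::vector<double> values_;
};

// C-callable interface: only C types cross the boundary, and the
// C++ object is hidden behind an opaque pointer.
extern "C" {
  void* writer_open(const char* name) { return new RecordWriter(name); }
  void writer_fill(void* w, double value) {
    static_cast<RecordWriter*>(w)->fill(value);
  }
  unsigned long writer_entries(void* w) {
    return static_cast<unsigned long>(static_cast<RecordWriter*>(w)->entries());
  }
  void writer_close(void* w) { delete static_cast<RecordWriter*>(w); }
}
```

A C program would declare the four `writer_*` functions in a header and link against the object file compiled with g++. Exceptions should not be allowed to escape through such functions, so a fuller version would catch them inside each wrapper.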

Hello,
My problem is not ROOT’s C++ interface; I am quite used to it.
The thing is that the HDF5 software distributed with Ubuntu doesn’t provide the C++ interface, only Fortran and C. (Remark: I mean the openmpi version of HDF5.)
Anyway, I was able to install HDF5 from source with C++ support and finally compile the hdf2root.cc program posted here.
However, it does not work well with the data files I want to convert.
I can give more details if anyone is interested.
Cheers.

Hi,

You may want to re-read Philippe’s post (and maybe use its terms to search in Google). As long as you use the same compiler tools (e.g., gcc and g++) for both, you should have no problem making a C++ function that can be called from C: you put the declaration of the C++ wrapper function in an extern “C” block so that it has a C name rather than a C++ one and can therefore be called from C and Fortran.

Good luck,
Charles

Hello,
Thanks for the reply, but I think I didn’t explain my problem properly.

  • We would like to convert data files from hdf5 to root format.
  • The program proposed in this post uses the C++ interface to HDF5.
  • The HDF5 version that we have doesn’t support C++ (there is no C++ support for the MPI I/O version of HDF5).
  • I wonder if there is a program around to convert from hdf5 to root which uses C HDF5 interface instead of the C++ one.

Any help or advice?
Thanks!

PS: I have also found this linux.softpedia.com/get/Utilitie … 1518.shtml .
But again, it uses the C++ interface to HDF5.

I think we did get your explanation. We are trying to tell you to write a C++ function (declared in an extern “C” block) that you can call from openmpi. This C++ wrapper function will be able to call ROOT.

Cheers,
Charles

Hi,

See www2.research.att.com/~bs/bs_faq2.html#callCpp

Philippe.

Hi,
Excuse me, please, if I am being annoying, but I know that C++ functions can be called from C and vice versa.
However, I will never be able to use Giuseppe Vacanti’s program to convert hdf5 files to root ones if I have neither the C++ headers nor the C++ libraries for HDF5:
H5Cpp.h and libhdf5_cpp.so don’t exist on my system, only hdf5.h and libhdf5.so.

Cheers, Alberto

Yes, you won’t be able to use his program. But you ought to be able to use it to figure out how to write your own. If you don’t want to do that, then maybe your effort would best be spent finding the libraries so that his will work for you as is.

Cheers,
Charles

Ok! That’s what I thought… thanks.
Anyway, I managed to compile C++ libraries for HDF5 and then, I could finally run this program as it is (pure C++).
However, it is not working properly with my data files and I don’t know why.
It fails in line 89:

  const int nm = type.getNmembers();

throwing the following message:

HDF5-DIAG: Error detected in HDF5 (1.8.7) thread 0:
  #000: H5Tfields.c line 90 in H5Tget_nmembers(): cannot return member number
    major: Invalid arguments to routine
    minor: Inappropriate type
  #001: H5Tfields.c line 131 in H5T_get_nmembers(): operation not supported for type class
    major: Invalid arguments to routine
    minor: Inappropriate type
terminate called after throwing an instance of ‘H5::DataTypeIException’
Abort

Here I attach the file which I am trying to convert:
gamma.h5.tar.gz (842 Bytes)
It is a very simple one with just 1 dataset called ‘gamma_|charge|’ under the main ‘/’ group.
Any clues?

Hello all,
After several days struggling with the HDF5 to ROOT conversion, I have created a program based on the original one posted by Giuseppe Vacanti: hdf2root.tar.gz (3.74 KB)
It converts the whole HDF5 structure into TTrees:
For every Group in the HDF5 file, hdf2root creates a TTree (named after the original Group) which stores the info contained in the original DataSets. Moreover, the Attributes of the Group are stored in another TTree (named as the original group plus “_att”), together with the Attributes of every DataSet in the Group.
The program is intended to work with any kind of HDF5 file and make a “general” conversion.
It works with all the data files I have tested, which amounts to 4 or 5 different types (only!).
A clear limitation is the handling of more complex DataTypes. Up to now, only simple types are supported, but this could be enough for many users (it is actually enough for the files I have to handle!).
The usage is quite simple:

Usage: hdf2root [--seq] <inFile.h5> <outFile.root(default=inFile.root)>
 
            --seq  Enables sequential mode.

hdf2root needs only the original HDF5 file as input and creates a root file with the same name. If you want a different output name, pass it as a second argument. If --seq is set, hdf2root creates the TTrees for the DataSets in sequential mode: the first dimension of the data set is used as an “event index” (if you know what I mean…).
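
The command line described above could be parsed along these lines. This is a minimal sketch and not the code in the attached archive; `Options`, `default_output` and `parse_args` are hypothetical names, and replacing a trailing ".h5" with ".root" is an assumption about how the default output name is derived.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical option record; the field names are illustrative.
struct Options {
  bool ok = false;          // false when the arguments do not match the usage
  bool sequential = false;  // true when --seq was given
  std::string input;
  std::string output;
};

// Replace a trailing ".h5" with ".root"; otherwise just append ".root".
std::string default_output(const std::string& input) {
  const std::string ext = ".h5";
  if (input.size() > ext.size() &&
      input.compare(input.size() - ext.size(), ext.size(), ext) == 0) {
    return input.substr(0, input.size() - ext.size()) + ".root";
  }
  return input + ".root";
}

// Parse: [--seq] <inFile.h5> [outFile.root]  (program name not included)
Options parse_args(const std::vector<std::string>& args) {
  Options opt;
  std::vector<std::string> positional;
  for (std::size_t k = 0; k < args.size(); ++k) {
    if (args[k] == "--seq") opt.sequential = true;
    else positional.push_back(args[k]);
  }
  if (positional.empty() || positional.size() > 2) return opt;  // usage error
  opt.input = positional[0];
  opt.output = positional.size() == 2 ? positional[1]
                                      : default_output(opt.input);
  opt.ok = true;
  return opt;
}
```

In `main`, the vector would be built from `argv + 1` up to `argv + argc`, and the usage text printed whenever `ok` is false.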

I encourage anyone interested to take a look and send some feedback.
One could start with the example file I posted in a previous post in this thread (gamma.h5).

Warning: the only problems I am experiencing are with the conversion of HDF5 strings, which are mostly present in Attributes.
The size of these objects does not match their length one to one and, as a result, unnecessarily long character strings are stored in root. This could be a problem with the particular hdf5 files I am using rather than with the converter itself.
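
One possible way to avoid storing the padding, assuming the strings are fixed-size and null-padded (an assumption; space-padded strings would need different trimming), is to cut the buffer at the first null byte before writing it out. `trim_fixed_string` is a hypothetical helper, not part of hdf2root:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Return the text stored in a fixed-size, null-padded buffer,
// dropping everything from the first '\0' onwards. If no terminator
// is present, all `size` bytes are kept.
std::string trim_fixed_string(const char* buffer, std::size_t size) {
  const void* end = std::memchr(buffer, '\0', size);
  const std::size_t length =
      end ? static_cast<std::size_t>(static_cast<const char*>(end) - buffer)
          : size;
  return std::string(buffer, length);
}
```

Using `memchr` with an explicit size rather than `strlen` keeps the read inside the buffer even when the string fills it completely and has no terminator.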

I hope it helps the community.

Cheers,
Alberto

Hi all,

This also looks promising: amas.web.psi.ch/tools/H5root/index.html

I have not tried it yet, but there is an interactive tool as well as a library that you can use to open/read/plot hdf5 files from the root prompt…

Hi delaossa,

I tried to compile the program you attached but I get the msg below. I am guessing that I am missing mpif77. I tried to install that through yum searching for openmpi but none of the packages I found help. Do you know if mpif77 is compatible with SLC6?

======================================
[wht34@icarus hdf2root]$ make
make: mpif77: Command not found
Linking hdf2root
hdf2root.o: In function `main':
/home/wht34/MyCodes/hdf2root/hdf2root.cpp:394: undefined reference to `H5check_version'
/usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../lib64/libhdf5_cpp.so: undefined reference to `H5Tunregister'
/usr/lib/gcc/x86_64-redhat-linux/4.4.7/../../../../lib64/libhdf5_cpp.so: undefined reference to `H5set_free_list_limits'

Hello,

Try this new version: hdf2root.tar.gz (4.86 KB)
You will need a root installation and the HDF5 libraries for C++.
There is a configuration script where you can define the paths to the needed libraries.
Let me know whether or not you manage to compile it.

Cheers,
Alberto

Hi Alberto,

I modified the env.sh file to point to my root installation directory. I am not sure if I pointed the hdf5 interface correctly but they’re the only hdf5 libraries I found on my computer. Make is still complaining about
undefined reference to `H5check_version' and so on with all these H5 commands being undefined.

Thanks,

Wing

======================
#!/bin/bash

# Setting up Root environment...

#export ROOTSYS=/usr/local/root
export ROOTSYS=/usr/share/root
export DYLD_LIBRARY_PATH=$ROOTSYS/lib:$DYLD_LIBRARY_PATH
export PATH=$ROOTSYS/bin:$PATH

# HDF5: C++ interface

#export HDF5CPP=/usr/share/hdf5-c++
export HDF5CPP=/usr/bin/h5c++
#export DYLD_LIBRARY_PATH=$HDF5CPP/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=/usr/lib64/:$DYLD_LIBRARY_PATH

============================
[wht34@icarus hdf2root]$ make
Linking hdf2root.e
g++ -fPIC -pthread -m64 -I/usr/share/root/include -I/include -I. hdf2root.cc -o hdf2root.e -L. -L/usr/share/root/lib -lCore -lCint -lRIO -lNet -lHist -lGraf -lGraf3d -lGpad -lTree -lRint -lPostscript -lMatrix -lPhysics -lMathCore -lThread -pthread -lm -ldl -rdynamic -L/lib -lhdf5_cpp
/tmp/wht34/ccAALUJ7.o: In function `main':
hdf2root.cc:(.text+0x2834): undefined reference to `H5check_version'
/usr/lib/../lib64/libhdf5_cpp.so: undefined reference to `H5Tunregister'

Hello,

The compiler also has to know where the headers for HDF5 are.
If you define the environment variable HDF5CPP in the configuration script, the Makefile takes the path to the HDF5 headers to be $HDF5CPP/include.

Make sure that the file H5Cpp.h is located there.

Cheers,
Alberto