Trouble reading RVec of vector from branch with TClonesArray of objects with std::vector member

I have a ROOT file and need to process the data in it. I have tried three approaches; please see the source code below. The one using TTreeReaderValue works, but I want to use RDataFrame for its parallel processing. However, it seems that RDataFrame can’t read an RVec<vector<double> > built from a TClonesArray of objects that have a member of type vector<double>.
Here is a minimal reproducer.

using namespace std;

// very simplified classes from real project
// I can't change source code of it without huge refactoring
class Event : public TObject
{
public:
	TClonesArray tracks;

	Event() : tracks{"Track", 5} {};

	ClassDef(Event, 1);
};

class Track : public TObject
{
public:
	vector<double> hitEnergies;

	Track(const vector<double>& v = {}) : hitEnergies{v} {};

	ClassDef(Track, 1);
};

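// fill the tree with events with a random number of tracks and a random number of hits
void write() {
	TFile *f1 = new TFile(TString("events.root"), "recreate");
	TTree *t1 = new TTree("Events", "events");

	Event *event = new Event();
	t1->Branch("Event.", "", event);

	int nentries = 4;
	for (int i = 0; i < nentries; ++i) {
		TClonesArray *tracks = &(event->tracks);

		int nTracks = gRandom->Integer(10) + 1;

		for (int j = 0; j < nTracks; ++j) {
			Track *track = (Track*)tracks->ConstructedAt(tracks->GetEntries());

			int n = gRandom->Integer(10) + 1;

			for (int k = 0; k < n; ++k) {
				// the loop body is missing from the post; a random energy is assumed here
				track->hitEnergies.push_back(gRandom->Uniform());
			}
		}

		// assumed remainder of the function: store the event, then reset the array
		t1->Fill();
		tracks->Delete(); // Delete rather than Clear: Track owns heap memory through its vector
	}

	f1->Write();
	f1->Close();
}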

// this method works, but is slow for large amounts of data
void read1() {
	TChain *events = new TChain("Events");
	events->Add("events.root"); // assumed: this line is missing from the post
	// events->Print();

	TTreeReader reader(events);
	TTreeReaderValue<Event> event(reader, "Event.");

	cout << "TTreeReaderValue<Event> says" << endl;
	while (reader.Next()) {
		cout << "Tracks: " << event->tracks.GetEntries() << endl; // OK, can extract data further
	}
}

// this is the method I tried first, but it doesn't work
void read2() {
	TChain *events = new TChain("Events");
	events->Add("events.root"); // assumed: this line is missing from the post
	// events->Print();

	TTreeReader reader(events);
	TTreeReaderArray<vector<double> > energies(reader, "Event.tracks.hitEnergies");

	cout << "TTreeReaderArray<vector<double> > says" << endl;
	while (reader.Next()) {
		cout << "Tracks: " << energies.GetSize() << endl; // Zeros! Extracting data leads to segfaults
	}
}

// this is the method I want to use to speed up calculations
void read3() {
	ROOT::RDataFrame d("Events", "events.root");

	// d.Describe().Print();
	// cout << endl;

	cout << "RDataFrame says" << endl;
	for (const auto &el : d.Take<ROOT::RVec<vector<double> > >("Event.tracks.hitEnergies")) {
		cout << "Tracks: " << el.size() << endl; // Zeros! Extracting data leads to segfaults
	}
}

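void example() {
	// the calls below are missing from the post; order assumed from the functions above
	write();

	read1();
	cout << endl;
	read2();
	cout << endl;
	read3();
}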

To sum up:

  1. Main problem. Reading values from the RVec<vector<double> > in RDataFrame segfaults. Investigation showed that the RVec has zero size. Is there any way to fix this?
  2. Less important problem. TTreeReaderArray<vector<double> > seems to have the same issue, but I probably won’t use it.

ROOT Version: 6.26/06
Platform: CentOS Linux release 7.9.2009 (Core) x86_64
Compiler: gcc (GCC) 12.2.0

Here is an even more minimal reproducer.

// test.cpp

class Track : public TObject
{
public:
	vector<double> hitEnergies;

	Track(const vector<double>& v = {}) : hitEnergies{v} {};

	ClassDef(Track, 1);
};

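void test() {
	TTree tree("Events", "events");

	TClonesArray arr("Track", 1);

	tree.Branch("Tracks.", &arr);

	// assumed: the lines that construct the two tracks are missing from the post
	arr.ConstructedAt(0);
	arr.ConstructedAt(1);
	((Track*)arr.At(0))->hitEnergies.assign({1.0, 2.0, 3.0});
	((Track*)arr.At(1))->hitEnergies.assign({4.0, 5.0});

	tree.Fill(); // assumed: the fill call is missing from the post

	// tree.Print();

	tree.DrawClone("Tracks.hitEnergies"); // data exist

	ROOT::RDataFrame d(tree);
	// d.Describe().Print();
	// cout << endl;

	for (const auto &el : d.Take<ROOT::RVec<vector<double> > >("Tracks.hitEnergies")) {
		// the index expression is missing from the post; el.at(1).at(0) is assumed
		// from the expected value and the out_of_range error
		cout << el.at(1).at(0) << endl; // should be "4.0", but produces out of bounds error instead
	}
}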

The tree is filled successfully, but trying to access the data via RDataFrame fails with

terminate called after throwing an instance of 'std::out_of_range'
  what():  RVecN

Is this a ROOT bug?

Hi @Ako_b,

This probably needs @eguiraud to reply / investigate what is going on. Let’s ping him.


Hi @Ako_b ,

and welcome to the ROOT forum!

RDataFrame uses TTreeReader under the hood, so the issue with zero-sized RVecs you see in RDataFrame is probably a consequence of the zero-sized arrays returned by TTreeReaderArray.

I’m taking a look!

Alright, there are a few things going on here mostly related to TTreeReader and ROOT I/O that make some things work and some things not with RDataFrame.

  1. The problem with TTreeReaderArray<vector<double>>(reader, "Event.tracks.hitEnergies") is a bug in TTreeReader; I opened an issue. RDataFrame uses TTreeReaderArray under the hood whenever you read a column as an RVec, so it hits the same bug. RDataFrame also reads columns as RVecs automatically if they are arrays, which is what happens in your reproducer.

  2. The workaround would be d.Take<TClonesArray>("Tracks"), but there is a hiccup: Take needs to read the TClonesArray for every event and copy it into the resulting vector<TClonesArray>. However, a TClonesArray of Track objects is not copyable if you run the program as an interpreted macro, because copying the array invokes Clone on its elements, and the default Clone implementation requires ROOT I/O dictionaries for the Track class. Running the macro as root -l -b -q test.C+ (with the +) generates dictionaries for the Track class before running, so this works:

#include <ROOT/RDataFrame.hxx>
#include <ROOT/RVec.hxx>
#include <TClonesArray.h>
#include <TFile.h>
#include <TObject.h>
#include <TTree.h>
#include <iostream>
#include <vector>

class Track : public TObject {
public:
  std::vector<double> hitEnergies;

  Track(const std::vector<double> &v = {}) : hitEnergies{v} {};

  ClassDef(Track, 1);
};

void works() {
  // write file
  {
    TFile f("f.root", "recreate");
    TTree tree("Events", "events");

    TClonesArray arr("Track", 1);

    tree.Branch("Tracks", &arr);

    // assumed: the lines that construct the two tracks are missing from the post
    arr.ConstructedAt(0);
    arr.ConstructedAt(1);
    ((Track *)arr.At(0))->hitEnergies.assign({1.0, 2.0, 3.0});
    ((Track *)arr.At(1))->hitEnergies.assign({4.0, 5.0});

    // assumed: fill and write calls are missing from the post
    tree.Fill();
    tree.Write();
  }

  // this involves a copy of the TClonesArray objects, but copying a
  // TClonesArray of an interpreted class does not work (at least not if `Track`
  // uses the default `Clone` method, overriding it might fix this issue).
  ROOT::RDataFrame d("Events", "f.root");
  std::vector<TClonesArray> arrs = d.Take<TClonesArray>("Tracks").GetValue();
  std::cout << arrs.size() << '\n';
  std::cout << static_cast<Track *>(arrs[0][0])->hitEnergies[1] << '\n';
  std::cout << static_cast<Track *>(arrs[0][1])->hitEnergies[0] << '\n';
}

to be run e.g. as root -l -b -q works.cpp+ (note the +).

Processing the TClonesArrays on the fly without copying them out also works, even without dictionaries, e.g.:

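ROOT::RDataFrame d("Events", "f.root");
// the call wrapping the lambda is missing from the post; Foreach is assumed,
// and the printed members mirror the example above
d.Foreach(
    [](const TClonesArray &arr) {
      std::cout << static_cast<Track *>(arr[0])->hitEnergies[1] << '\n';
      std::cout << static_cast<Track *>(arr[1])->hitEnergies[0] << '\n';
    },
    {"Tracks"});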

Overriding the Clone method in Track could also be a workaround for problem number 2.
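As a rough illustration of the Clone idea, here is a ROOT-free sketch of the pattern: a container copy that goes through a virtual Clone, and a Track that overrides Clone with a plain copy construction. ObjectLike and cloneAll are hypothetical stand-ins (not ROOT classes or functions), and the null-returning default Clone only mimics the missing-dictionary case. In ROOT itself the method to override would be TObject::Clone(const char *newname = "") const; whether that is enough to fix the interpreted-macro case is an assumption I have not verified.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for TObject (NOT the ROOT class).
// Its default Clone mimics a dictionary-dependent implementation
// that cannot produce a copy when no dictionary is available.
class ObjectLike {
public:
  virtual ~ObjectLike() = default;
  virtual std::unique_ptr<ObjectLike> Clone() const { return nullptr; }
};

class Track : public ObjectLike {
public:
  std::vector<double> hitEnergies;

  explicit Track(const std::vector<double> &v = {}) : hitEnergies(v) {}

  // Override: copy-construct directly, so no dictionary lookup is involved.
  // The corresponding ROOT signature would be
  //   TObject *Clone(const char *newname = "") const override;
  std::unique_ptr<ObjectLike> Clone() const override {
    return std::make_unique<Track>(*this);
  }
};

// Copies a container element by element through Clone,
// the way a container copy that relies on Clone would.
std::vector<std::unique_ptr<ObjectLike>>
cloneAll(const std::vector<std::unique_ptr<ObjectLike>> &src) {
  std::vector<std::unique_ptr<ObjectLike>> out;
  out.reserve(src.size());
  for (const auto &obj : src) {
    assert(obj != nullptr); // the sketch expects fully populated containers
    out.push_back(obj->Clone());
  }
  return out;
}
```

With the override in place, cloneAll returns independent copies whose hitEnergies match the originals; with only the stand-in default, it would return null pointers.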

I hope this helps!

For more information on what dictionaries are and how to generate them for your classes, see the ROOT page "I/O of custom classes".

Thank you, jalopezg and eguiraud, for your replies!

With this approach it works completely fine. I greatly appreciate your help!