Reading RVec<RVec<>> in ROOT prompt

jipark · January 25, 2023, 5:03am

Hello,

I’m trying to use RDF for my analysis. Most features work fine, but I hit a strange error when trying to read branches of type
ROOT::VecOps::RVec<ROOT::VecOps::RVec<float>>
and
ROOT::VecOps::RVec<ROOT::VecOps::RVec<double>>

resulting in
Error in `/opt/ohpc/pub/root/root_v6.26.06_gcc830/bin/root.exe': realloc(): invalid old size: 0x0000000004964d20 ***
or
munmap_chunk(): invalid pointer

or similar memory errors.

My Linkdef.h looks like this:

#ifndef LINKDEF_H_
#define LINKDEF_H_

#pragma link C++ class std::vector<std::vector<float>>+;
#pragma link C++ class ROOT::VecOps::RVec<float>+;
#pragma link C++ class ROOT::VecOps::RVec<ROOT::VecOps::RVec<float>>+;
#pragma link C++ class ROOT::VecOps::RVec<double>+;
#pragma link C++ class ROOT::VecOps::RVec<ROOT::VecOps::RVec<double>>+;

#endif /* LINKDEF_H_ */

Adding gROOT->ProcessLine("#include \"src/Linkdef.h\""); before opening the file didn’t work.

This is very strange because the analysis code using RDF can read those branches without a problem and can write new branches derived from them.
I can also read the branch using uproot (4.3.7) with tree.arrays().
But calling GetEntry() on the tree was not possible with PyROOT; it gave memory errors or a segmentation fault.
Since I can work with RVec<RVec<>> through several other methods, I suspect the problem lies in the ROOT command-line functions that use an internal loop.

The example root file is uploaded at my cern web page:
https://jiwonpark.web.cern.ch/temp/test/280000_0C76D3EA-6565-B24B-B716-B9BB61B55D59.root

Thanks in advance
J

ROOT Version: 6.26
Platform: CentOS 7.9
Compiler: ROOT is compiled with gcc 8.3.0

couet · January 25, 2023, 6:25am

Welcome to the ROOT forum,

Maybe @vpadulan can help.

eguiraud · January 30, 2023, 5:36pm

Hi @jipark ,

and welcome to the ROOT forum. How was the file produced and with what version of ROOT?

Cheers,
Enrico

P.S.
It looks like ROOT knew how to handle the branch type RVec<RVec<float>> at the time the file was written, so it’s weird that it cannot read it back if both operations happen in the same environment.

eguiraud · January 30, 2023, 8:45pm

Hello again

As far as I can tell, the problem is indeed the missing dictionary for RVec<RVec<float>>: I can make things work with the dictionary.

What I tried:

// LinkDef.h
#ifdef __CLING__
#pragma link C++ class ROOT::VecOps::RVec<ROOT::VecOps::RVec<float>>;
#endif

// foo.h
#include <ROOT/RVec.hxx>

// foo.cpp
#include <ROOT/RDataFrame.hxx>
#include <TApplication.h>
#include <ROOT/RVec.hxx>

int main() {
  TApplication app("app", nullptr, nullptr);
  ROOT::RDataFrame df("Events",
                      "280000_0C76D3EA-6565-B24B-B716-B9BB61B55D59.root");
  auto h = df.Define("s",
             [](const ROOT::RVec<ROOT::RVec<float>> &v) { return v.size(); },
             {"Tau_pt_unc"})
    .Histo1D("s");
  h->Draw();
  app.Run();
}

and then at the prompt:

$ root -l --version
ROOT Version: 6.26/10
Built for linuxx8664gcc on Nov 17 2022, 16:18:00
$ rootcling foo_dict.cpp foo.h LinkDef.h
$ g++ -o foo foo.cpp foo_dict.cpp $(root-config --libs --cflags)
$ ./foo # seems to work

Here are some extra instructions on how to generate dictionaries.

Note that ROOT should not crash without a useful error message in this case; it should report the missing dictionary. We are looking into fixing the diagnostic for this case.

Cheers,
Enrico

eguiraud · January 30, 2023, 8:48pm

An even simpler version of a working program that reads the "Tau_pt_unc" branch from the file correctly:

// foo.C
#include <ROOT/RDataFrame.hxx>
#include <TApplication.h>
#include <ROOT/RVec.hxx>

#ifdef __CLING__
#pragma link C++ class ROOT::VecOps::RVec<ROOT::VecOps::RVec<float>>;
#endif

void foo() {
  ROOT::RDataFrame df("Events",
                      "280000_0C76D3EA-6565-B24B-B716-B9BB61B55D59.root");
  auto h = df.Define("s",
             [](const ROOT::RVec<ROOT::RVec<float>> &v) { return v.size(); },
             {"Tau_pt_unc"})
    .Histo1D("s");
  h->DrawClone();
}

which you can run e.g. as:

$ root -l -b -q foo.C+

jipark · January 31, 2023, 3:19am

Hello Enrico,

Many thanks for digging into this problem.
In the end, is it true that the macro has to be compiled with the proper Linkdef.h and then run? The analysis framework currently works that way, but in that case it is not possible to check branch contents from the ROOT prompt (e.g. Events->Scan("Tau_pt_unc")).

In Python, using the following script, I can read the branches directly without a Linkdef:

import uproot
import pandas as pd
import numpy as np

inputvars = ["Tau_pt_unc"]
# some_input_file is a placeholder for the path to the input ROOT file
infile = uproot.open(some_input_file)
tree = infile["Events"]
pd_data = tree.arrays(inputvars, library="pd")

with pd.option_context('display.max_rows', None, 'display.max_columns', None, 'display.max_colwidth', None):
    print(pd_data)

I have never tested with nested std::vector types or similar, but this is quite a common way to store per-physics-object scale factors and their uncertainties. It would therefore be great if ROOT could interpret nested RVecs from the native prompt…

Best,
Jiwon

eguiraud · January 31, 2023, 3:23pm

It can, but you have to load the dictionaries (and again the fact that ROOT doesn’t tell you that is a bug, @pcanal is looking into it). This works for me:

import ROOT

ROOT.gInterpreter.GenerateDictionary("ROOT::RVec<ROOT::RVec<float>>", "ROOT/RVec.hxx")

df = ROOT.RDataFrame("Events", "../280000_0C76D3EA-6565-B24B-B716-B9BB61B55D59.root")
h = df.Define("s", "Tau_pt_unc.size()")\
      .Histo1D("s")
h.DrawClone()

It’s a bit slow to call GenerateDictionary every time – after the first time, when you have the dictionaries, you can substitute the call to GenerateDictionary with ROOT.gInterpreter.Load("AutoDict_ROOT__RVec_ROOT__RVec_float___cxx.so").
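
The "generate once, load afterwards" pattern described above could be wrapped in a small helper. This is only a minimal sketch: `autodict_library_name` and `ensure_dictionary` are hypothetical names, and the file-name rule (every non-alphanumeric character becomes an underscore) is an assumption inferred from the single library name quoted in this thread, not a documented guarantee. It also assumes the script always runs from the directory where the dictionary was first generated.

```python
import os
import re


def autodict_library_name(class_name):
    """Guess the shared-library name left behind by GenerateDictionary.

    Assumption: each character that is not a letter or digit is replaced
    by "_". This reproduces the one file name quoted in the thread.
    """
    return "AutoDict_" + re.sub(r"[^0-9A-Za-z]", "_", class_name) + "_cxx.so"


def ensure_dictionary(class_name, header):
    """Load a previously generated dictionary, or generate it the first time."""
    import ROOT  # imported lazily so the name helper can be used without ROOT

    library = autodict_library_name(class_name)
    if os.path.exists(library):
        # Fast path: reuse the library found in the current directory.
        ROOT.gInterpreter.Load(library)
    else:
        # Slow path: generate the dictionary (done only once per directory).
        ROOT.gInterpreter.GenerateDictionary(class_name, header)
    return library
```

With this in place, a script would call `ensure_dictionary("ROOT::RVec<ROOT::RVec<float>>", "ROOT/RVec.hxx")` once before constructing the RDataFrame.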

Cheers,
Enrico
