How to explore RNTuples in the interpreter

Dear ROOT community,

I’ve been experimenting with RNTuples and, while they are generally a net upgrade over TTrees, I have not found as nice a way to explore the data in the ROOT interpreter.

TTree has Print, Draw, Scan and so on, whereas the best solution I’ve found for RNTuple is to wrap it in an RDataFrame and then run operations (Describe, Define, Filter, Foreach, Histo1D…) on it. It works, but it really isn’t as practical as the TTree functions. Histo1D in particular is quite impractical, as it requires defining the histogram model, which means already having some idea of the data range.

Am I missing something? Is there a better way to explore RNTuple data?

Best wishes,

EK

ROOT Version: 6.38.02
Platform: x86_64 Linux 6.12.74-1-lts
Compiler: Not Provided


Hello @EKEK,

For a first impression of the data, you can use the RNTupleReader, specifically its Show() and PrintInfo() methods.

If you want more details about e.g. the achieved compression, there is RNTupleInspector.

For plotting, we would indeed recommend RDataFrame. You don’t actually need to know the data range, and can use auto-binning similar to how it’s done with TTree. There is a table of command examples in the RDataFrame documentation, the Rosetta Stone. In your case, the first example might already be useful:

auto *tree = file->Get<TTree>("myTree");
tree->Draw("x", "y > 2");

becomes:

ROOT::RDataFrame df("myNTuple", file);
df.Filter("y > 2").Histo1D("x")->Draw();

When you don’t specify a histogram model, it falls back to auto-binning. Concretely, this uses the overload of Histo1D() that takes only the column name.

Thanks for the answer!

So I have to define an extra object (RNTupleInspector or RDataFrame) to investigate the RNTuple, correct? There is no way to simply Draw, like for TTree?
Perhaps I missed it, but is there an alternative to TTree::Scan?

Hello,

you are correct. Since RDataFrame is supposed to read both TTree and RNTuple, it made sense to put the reading/drawing/scanning functionality only in one place instead of implementing it again for RNTuple.

The TTree::Scan equivalent in RDataFrame is Display(). What might feel new about Display is that it is a lazy action rather than an immediate one (so it can run in the same event loop as, e.g., filling histograms). For that reason, Display returns an RResultPtr that you have to dereference like so:

root [1] ROOT::RDataFrame df("Events", "root://eospublic.cern.ch//eos/opendata/cms/derived-data/AOD2NanoAODOutreachTool/Run2012BC_DoubleMuParked_Muons.root");
root [2] df.Display(".*")->Print()
+-----+-------------+-----------+-----------+-----------+-----------+-------+
| Row | Muon_charge | Muon_eta  | Muon_mass | Muon_phi  | Muon_pt   | nMuon | 
+-----+-------------+-----------+-----------+-----------+-----------+-------+
| 0   | -1          | 1.066827  | 0.105658  | -0.034273 | 10.763697 | 2     | 
|     | -1          | -0.563787 | 0.105658  | 2.542615  | 15.736523 |       | 
+-----+-------------+-----------+-----------+-----------+-----------+-------+
| 1   | 1           | -0.427780 | 0.105658  | -0.274792 | 10.538490 | 2     | 
|     | -1          | 0.349225  | 0.105658  | 2.539781  | 16.327097 |       | 
+-----+-------------+-----------+-----------+-----------+-----------+-------+
| 2   | 1           | 2.210855  | 0.105658  | -1.223414 | 3.275326  | 1     | 
+-----+-------------+-----------+-----------+-----------+-----------+-------+
| 3   | 1           | -1.588240 | 0.105658  | -2.077304 | 11.429154 | 4     | 
|     | 1           | -1.751184 | 0.105658  | 0.251358  | 17.634033 |       | 
|     | 1           | -1.590997 | 0.105658  | -2.013049 | 9.624728  |       | 
|     | 1           | -1.655963 | 0.105658  | -1.849973 | 3.502225  |       | 
+-----+-------------+-----------+-----------+-----------+-----------+-------+
| 4   | -1          | -2.172484 | 0.105658  | -2.370008 | 3.283442  | 4     | 
|     | -1          | -2.182535 | 0.105658  | -2.305139 | 3.644006  |       | 
|     | 1           | -1.123363 | 0.105658  | -0.975242 | 32.911224 |       | 
|     | 1           | -1.162901 | 0.105658  | -0.773005 | 23.721754 |       | 
+-----+-------------+-----------+-----------+-----------+-----------+-------+

You can select the columns either with a vector of strings or with a regular expression like I did above.

Thanks a lot for the answers!

If I may make a feature request, it would be to create some simpler interface for RNTuple exploration though.

OK, thanks for the input! I’ll raise it with the RNTuple experts.
