Problem in accessing individual elements of an array branch through RDataFrame



ROOT Version: 6.26/00
Platform: Ubuntu 20.04
Compiler: gcc 10.3.0


Hi. I have a TChain with a branch holding an array of Float_t (called AnodeCurrent). The size of the array is fixed (1088). I created an RDataFrame from this TChain, and I am trying to calculate the mean of each array element across rows. But even though RDataFrame can identify the branch AnodeCurrent, it doesn’t seem to recognize expressions like AnodeCurrent[0], AnodeCurrent[1], and so on.

root [0] .L ../lib/libMDA.so 
root [1] .include ../include
root [2] DataSynchroniser dm
(DataSynchroniser &) Name:  Title: 
root [3] dm.SyncData("dataseton.txt","ON")
Warning in <DataSynchroniser::SyncData>: Invalid data type
Warning in <DataSynchroniser::SyncData>: Invalid data type
(int) 114
root [4] TChain *ch = dm.GetChain("Critical")
(TChain *) 0x55e790c237c0
root [5] auto df = ROOT::RDataFrame(*ch)
(ROOT::RDataFrame &) A data frame built on top of the ON_critical_Chain dataset.
root [6] df.Mean("Zenith").GetValue()
(const double) -16.586547
root [7] df.Mean("AnodeCurrent[0]").GetValue()
Error in <TRint::HandleTermInput()>: std::runtime_error caught: Unknown column: AnodeCurrent[0]

What’s happening?

Here is the TChain::Print output for the case above.

******************************************************************************
*Chain   :Oct11_critical_Chain: root:///home/chinmay/DataComparison2021/test_anamace/garuda/006574/Critical_020900.root *
******************************************************************************
******************************************************************************
*Tree    :CriticalTelemetryData: Critical Telemetry Parameters                          *
*Entries :     1324 : Total =        36671847 bytes  File  Size =    8266418 *
*        :          : Tree compression factor =   4.44                       *
******************************************************************************
*Br    0 :Event     : Event/I                                                *
*Entries :     1324 : Total  Size=       5870 bytes  File Size  =       1985 *
*Baskets :        1 : Basket Size=      32000 bytes  Compression=   2.71     *
*............................................................................*
*Br    1 :NChannel  : NChannel/I                                             *
*Entries :     1324 : Total  Size=       5885 bytes  File Size  =        156 *
*Baskets :        1 : Basket Size=      32000 bytes  Compression=  34.54     *
*............................................................................*
*Br    2 :MET       : MET/D                                                  *
*Entries :     1324 : Total  Size=      11164 bytes  File Size  =       2994 *
*Baskets :        1 : Basket Size=      32000 bytes  Compression=   3.57     *
*............................................................................*
*Br    3 :Zenith    : Zenith/D                                               *
*Entries :     1324 : Total  Size=      11179 bytes  File Size  =       9834 *
*Baskets :        1 : Basket Size=      32000 bytes  Compression=   1.09     *
*............................................................................*
*Br    4 :Azimuth   : Azimuth/D                                              *
*Entries :     1324 : Total  Size=      11184 bytes  File Size  =       9941 *
*Baskets :        1 : Basket Size=      32000 bytes  Compression=   1.07     *
*............................................................................*
*Br    5 :NModule   : NModule/I                                              *
*Entries :     1324 : Total  Size=       5880 bytes  File Size  =        154 *
*Baskets :        1 : Basket Size=      32000 bytes  Compression=  34.98     *
*............................................................................*
*Br    6 :Scalar_SLTG : Scalar_SLTG[8]/s                                     *
*Entries :     1324 : Total  Size=      21790 bytes  File Size  =       9147 *
*Baskets :        1 : Basket Size=      32000 bytes  Compression=   2.33     *
*............................................................................*
*Br    7 :status_DC : status_DC[4]/b                                         *
*Entries :     1324 : Total  Size=       5890 bytes  File Size  =        152 *
*Baskets :        1 : Basket Size=      32000 bytes  Compression=  35.45     *
*............................................................................*
*Br    8 :status_SLTG : status_SLTG[8]/b                                     *
*Entries :     1324 : Total  Size=      11196 bytes  File Size  =        177 *
*Baskets :        1 : Basket Size=      32000 bytes  Compression=  60.38     *
*............................................................................*
*Br    9 :status_TC : status_TC[5]/b                                         *
*Entries :     1324 : Total  Size=       7214 bytes  File Size  =        167 *
*Baskets :        1 : Basket Size=      32000 bytes  Compression=  40.20     *
*............................................................................*
*Br   10 :status_LC : status_LC[2]/b                                         *
*Entries :     1324 : Total  Size=       3242 bytes  File Size  =        139 *
*Baskets :        1 : Basket Size=      32000 bytes  Compression=  19.72     *
*............................................................................*
*Br   11 :CCR_MOD   : CCR_MOD[NChannel]/i                                    *
*Entries :     1324 : Total  Size=    5790363 bytes  File Size  =     460989 *
*Baskets :      190 : Basket Size=      32000 bytes  Compression=  12.55     *
*............................................................................*
*Br   12 :PCR_MOD   : PCR_MOD[NChannel]/i                                    *
*Entries :     1324 : Total  Size=    5790363 bytes  File Size  =     239340 *
*Baskets :      190 : Basket Size=      32000 bytes  Compression=  24.18     *
*............................................................................*
*Br   13 :Temp_Status : Temp_Status[NChannel]/s                              *
*Entries :     1324 : Total  Size=    2898433 bytes  File Size  =      40854 *
*Baskets :       95 : Basket Size=      32000 bytes  Compression=  70.89     *
*............................................................................*
*Br   14 :PS_Status : PS_Status[NChannel]/b                                  *
*Entries :     1324 : Total  Size=    1451792 bytes  File Size  =      23110 *
*Baskets :       46 : Basket Size=      32000 bytes  Compression=  62.76     *
*............................................................................*
*Br   15 :DRS_Runtime_error_Status : DRS_Runtime_Error_Status[NChannel][4]/b *
*Entries :     1324 : Total  Size=    5793668 bytes  File Size  =     414747 *
*Baskets :      190 : Basket Size=      32000 bytes  Compression=  13.96     *
*............................................................................*
*Br   16 :Dis_Pmt   : Dis_Pmt[NChannel]/s                                    *
*Entries :     1324 : Total  Size=    2898037 bytes  File Size  =     156630 *
*Baskets :       95 : Basket Size=      32000 bytes  Compression=  18.49     *
*............................................................................*
*Br   17 :AnodeCurrent : ACR[NChannel]/F                                     *
*Entries :     1324 : Total  Size=    5791313 bytes  File Size  =    2707103 *
*Baskets :      190 : Basket Size=      32000 bytes  Compression=   2.14     *
*............................................................................*
*Br   18 :SCR       : SCR[NChannel]/i                                        *
*Entries :     1324 : Total  Size=    5789587 bytes  File Size  =    4142410 *
*Baskets :      190 : Basket Size=      32000 bytes  Compression=   1.40     *
*............................................................................*
*Br   19 :Tau       : Tau[NModule]/F                                         *
*Entries :     1324 : Total  Size=     367200 bytes  File Size  =      34482 *
*Baskets :       12 : Basket Size=      32000 bytes  Compression=  10.63     *
*............................................................................*

You probably want: ACR[0]
Note also that this is a variable-size array so, for every tree entry, you should check: NChannel > 0

ACR[0], ACR[1], … do not work either.
NChannel has the same value in all rows.
Also, RDataFrame's GetColumnNames() gives the following output.

root [9] df.GetColumnNames()
(ROOT::RDF::ColumnNames_t) { "AnodeCurrent", "Azimuth", "CCR_MOD", "DRS_Runtime_error_Status", "Dis_Pmt", "Event", "MET", "NChannel", "NModule", "PCR_MOD", "PS_Status", "SCR", "Scalar_SLTG", "Tau", "Temp_Status", "Zenith", "status_DC", "status_LC", "status_SLTG", "status_TC" }

from which it seems that the AnodeCurrent branch is recognised by RDataFrame.

You probably need ROOT 6.26/02 (when name and leaflist of a TBranch are different).

It also fails for other similar array branches that have the same name and leaflist (e.g. SCR, CCR_MOD, etc.).
Also, these branches are created with statements of the form

	teltree->Branch("CCR_MOD", ccr_mod, "CCR_MOD[NChannel]/F");

so there is no leaflist as such.

Checked with 6.26/02. It doesn’t work there either.

Hi @Chinmay ,
the mismatch is that "AnodeCurrent" is a column name, while "AnodeCurrent[0]" is an expression, and e.g. df.Mean(X) requires X to be a column name.

This should do what you want:

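  std::vector<ROOT::RDF::RResultPtr<double>> means(1088);
  for (int i = 0; i < 1088; ++i) {
    // AnodeCurrent holds Float_t values, so the column is read as RVec<float> (ROOT::RVecF)
    means[i] = df.Define("x", [i](const ROOT::RVecF &v) { return v[i]; }, {"AnodeCurrent"})
                 .Mean("x");
  }

  // The first GetValue() triggers a single event loop that fills all booked means
  for (auto &m : means)
    std::cout << m.GetValue() << '\n';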

If the default Mean implementation is too numerically unstable, you can also use a Kahan sum, see e.g. the ROOT tutorial tutorials/dataframe/df022_useKahan.C.
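For readers unfamiliar with the technique, here is a minimal sketch of a Kahan (compensated) mean in plain C++, independent of ROOT. The helper name kahanMean and the choice to return 0 for empty input are assumptions made for this illustration; the tutorial mentioned above shows how to plug a Kahan sum into RDataFrame itself.

```cpp
#include <vector>

// Compensated (Kahan) mean of a sequence of doubles.
// kahanMean is a hypothetical helper name, not part of ROOT.
double kahanMean(const std::vector<double> &values)
{
   if (values.empty())
      return 0.0; // convention chosen for this sketch

   double sum = 0.0;          // running total
   double compensation = 0.0; // rounding error carried over from earlier additions
   for (double v : values) {
      const double y = v - compensation; // correct the next term by the carried error
      const double t = sum + y;          // low-order bits of y may be lost here
      compensation = (t - sum) - y;      // recover what was just lost
      sum = t;
   }
   return sum / static_cast<double>(values.size());
}
```

The compensation only works if the compiler keeps the arithmetic as written, so options such as -ffast-math should be avoided for this function.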

Cheers,
Enrico

Hi @eguiraud.
Thanks for this solution.
Is there a way to template the snippet of code you gave? E.g. if I want to use the same function to calculate the mean of the CCR_MOD branch elements: CCR_MOD is an integer array, so the lambda
expression would have to be templated. But I think, at least in C++17, that’s not possible.
Also, it seems there is no support for the ‘unsigned char’ type in ROOT::RVec. How can I use RVec for unsigned char variables?

If this is not super performance-critical code, you can use this generic version:

  std::vector<ROOT::RDF::RResultPtr<double>> means(1088);
  std::string branchName = "AnodeCurrent";
  for (int i = 0; i < 1088; ++i) {
    means[i] = df.Define("x", branchName + "[" + std::to_string(i) + "]")
                 .Mean("x");
  }

  for (auto m : means)
    std::cout << m.GetValue() << '\n';

And you can have several branch names in a vector<string> and have an outer loop over those.
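The outer loop over branch names could be organised along these lines. This is only a sketch of the string-building part in plain C++, so that it compiles without ROOT; the helper names elementExpression and allElementExpressions are made up for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Build the expression "<branch>[<index>]" that is handed to Define as a string.
// Hypothetical helper, not part of ROOT.
std::string elementExpression(const std::string &branch, std::size_t index)
{
   return branch + "[" + std::to_string(index) + "]";
}

// One expression per (branch, element) pair, grouped by branch.
std::vector<std::string> allElementExpressions(const std::vector<std::string> &branches,
                                               std::size_t nElements)
{
   std::vector<std::string> expressions;
   expressions.reserve(branches.size() * nElements);
   for (const auto &branch : branches)
      for (std::size_t i = 0; i < nElements; ++i)
         expressions.push_back(elementExpression(branch, i));
   return expressions;
}
```

Each resulting string would take the place of the second argument of Define in the loop above. Booking all the Mean results first and calling GetValue() only afterwards keeps everything in a single event loop.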

If you need a version tweaked for performance we can work that out too, it will just require a few more characters :)

Cheers,
Enrico

Cool… That works for me.
Thanks.
Is there support for unsigned char in RVecs? I.e., if I have an array that holds UChar_t values, will it be read out properly? If yes, what will the column type be? The web documentation lists the basic data types like RVecF, RVecD, RVecI, etc., but not the UChar_t type.

It will be RVec<UChar_t>
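UChar_t is ROOT's typedef for unsigned char, so the elements behave like ordinary 8-bit unsigned integers. The following plain C++ sketch uses std::vector<unsigned char> as a stand-in for one row of such a column (it is not RVec itself), and the names Row, elementMean and asNumber are hypothetical. It illustrates two things worth keeping in mind with this type: accumulate in a wider type, and convert to int before printing.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for one row of a UChar_t array branch (assumption for this sketch).
using Row = std::vector<unsigned char>;

// Mean of element `index` across rows, accumulated in double so that
// 8-bit values cannot overflow. Rows too short to contain `index` are skipped.
double elementMean(const std::vector<Row> &rows, std::size_t index)
{
   double sum = 0.0;
   std::size_t count = 0;
   for (const auto &row : rows) {
      if (index < row.size()) {
         sum += row[index];
         ++count;
      }
   }
   return count > 0 ? sum / static_cast<double>(count) : 0.0;
}

// Format a byte as a number; streaming an unsigned char directly to
// std::cout would print it as a character instead.
std::string asNumber(unsigned char value)
{
   return std::to_string(static_cast<int>(value));
}
```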
