RDataFrame column types when reading a CSV file

ROOT Version: 6.14.04
Platform: MacOSX
Compiler: Not Provided

I am using ROOT::RDF::MakeCsvDataFrame to create a DataFrame from a
CSV file. Afterwards I would like to filter on a particular column using a string
operation. How can I make sure that this column is read as a string?

Hi Eddy,
you can check what type a certain column is inferred to be with df.GetColumnType("column_name"). If that returns "std::string" I guess you are okay.

If not, you have two options. Either read the value as whatever type RDataFrame inferred for the column and convert it to a std::string with std::to_string, or edit the very first line of your CSV file, e.g. by putting the column value between quotes, to make it clear that you want that column to be treated as a string (RDataFrame decides what types your columns are by looking at the first row).
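A minimal sketch of the first option, written as plain C++ so that it compiles without ROOT. The helper names `ZipToString` and `StartsWith` are hypothetical, and the `Int64` alias stands in for ROOT's `Long64_t` on the assumption that the column was inferred as an integer:

```cpp
#include <string>

// Stand-in for ROOT's Long64_t (a typedef for long long), so that this
// sketch builds without ROOT headers.
using Int64 = long long;

// Convert a numeric column value to its decimal string representation.
std::string ZipToString(Int64 zip)
{
   return std::to_string(zip);
}

// Example of a string operation to filter on: does `s` begin with `prefix`?
bool StartsWith(const std::string &s, const std::string &prefix)
{
   return s.compare(0, prefix.size(), prefix) == 0;
}
```

With ROOT available, these could be plugged in along the lines of `df.Define("zip_str", ZipToString, {"zip"})` followed by a `Filter` on `"zip_str"` that calls `StartsWith`. Here `"zip"` and `"zip_str"` are made-up column names.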

Currently we do not support telling RDataFrame what type a CSV column is.

Hi Enrico,

Thanks, that helps. I guessed that RDataFrame decides about the types by looking at the first line.

In my case the zip code was numeric in the first row, but some later rows had ‘xxxx’ entries. Quoting the header line did indeed solve the issue by forcing this field to be interpreted as std::string.
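To illustrate why a column like this ends up with the wrong type, here is a simplified sketch of inference from a single cell. This is not ROOT's actual implementation: the function name `InferType` and the regular expressions are assumptions made for illustration only.

```cpp
#include <regex>
#include <string>

// Guess a column type name from the text of one cell, the way a
// first-row-only inference would. Simplified and hypothetical.
std::string InferType(const std::string &cell)
{
   static const std::regex intRe("[-+]?[0-9]+");
   static const std::regex dblRe("[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?");
   static const std::regex boolRe("true|false", std::regex::icase);

   // regex_match requires the whole cell to match the pattern.
   if (std::regex_match(cell, intRe))
      return "Long64_t";
   if (std::regex_match(cell, dblRe))
      return "double";
   if (std::regex_match(cell, boolRe))
      return "bool";
   // Anything else, including a quoted number, is treated as text.
   return "std::string";
}
```

Under these assumptions a first-row value of `1234` yields an integer type, so a later `xxxx` in the same column cannot be read, whereas a first-row value of `"1234"` (with quotes) yields `std::string`.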


Are you aware that the column name then also contains the quotes?

Hi Eddy,
uhm I’m not sure what you mean.

Given the following CSV file:


I seem to get reasonable results (even if I quote "name1", "name2" or "name3"):

root [0] auto df = ROOT::RDF::MakeCsvDataFrame("asd.csv")
root [1] df.GetColumnNames()
(ROOT::RDF::RInterface::ColumnNames_t) { "name1", "name2", "name3" }
root [2] df.GetColumnNames()[1][0]
(char) 'n' // not '"'
root [3] df.GetColumnType("name2")
(std::string) "std::string"

Let us know if you see anything unexpected.

Hi Enrico,

I agree, my mistake. I used ‘name’ instead of “name”.

Indeed the result is a std::string, but the column name is now ‘name’.


Good! I’ll mark this as solved then :)
