I am interested in sharing the field addresses (std::shared_ptr<T>) used by RNTupleReader across multiple instances of RNTupleReader. This would enable the framework I’m developing to read multiple files in sequence without clunky callbacks that tell each processor to update the address it is looking at.
Context: I am working on updating the LDMX-Software/ldmx-sw Framework to use RNTuple instead of TTree (mock-up framework on Codeberg), and one feature that I would like to carry forward is the ability to read multiple input files while processing. The current TTree Framework does this by maintaining control of all of the branch addresses itself. I would like to avoid implementing this type-erasure myself since it is already implemented in RNTupleModel/REntry.
ROOT Version: 6.38.00
Platform: linuxx8664gcc
Compiler: c++ (Ubuntu 15.2.0-4ubuntu4) 15.2.0 std202302
My naive implementation where I manage an REntry and RNTupleModel fails because when I Clone the model to give it to the RNTupleReader and LoadEntry into my REntry created by the original model, the fModelId does not match (although I suspect the fSchemaId would).
auto reading_model = ROOT::RNTupleModel::CreateBare();
{
  std::cout << "spy on first file to get model" << std::endl;
  ROOT::RNTupleDescriptor::RCreateModelOptions create_model_options;
  create_model_options.SetCreateBare(true);
  auto reader = ROOT::RNTupleReader::Open(create_model_options, TUPLENAME, FILEPATHS[0]);
  const auto& desc = reader->GetDescriptor();
  for (const auto& field_desc : desc.GetTopLevelFields()) {
    reading_model->AddField(field_desc.CreateField(desc));
  }
}
std::cout << "freeze model and create full-run entry" << std::endl;
reading_model->Freeze();
auto entry = reading_model->CreateEntry();
auto ifile_ptr = entry->GetPtr<int>("ifile");
auto value = entry->GetPtr<int>("value");
std::cout << "read both files in sequence" << std::endl;
for (int ifile{0}; ifile < FILEPATHS.size(); ifile++) {
  auto reader = ROOT::RNTupleReader::Open(reading_model->Clone(), TUPLENAME, FILEPATHS[ifile]);
  for (auto ientry : *reader) {
    reader->LoadEntry(ientry, *entry);
    std::cout << *ifile_ptr << " -> " << *value << std::endl;
  }
}
produces
writing
spy on first file to get model
freeze model and create full-run entry
read both files in sequence
terminate called after throwing an instance of 'ROOT::RException'
what(): mismatch between entry and model
At:
void ROOT::RNTupleReader::LoadEntry(ROOT::NTupleSize_t, ROOT::REntry&) [/opt/root/include/ROOT/RNTupleReader.hxx:241]
Aborted (core dumped)
where the exception originates from comparing the model IDs and not the schema IDs.
Is this a bug in the reading implementation? I feel like I should (hypothetically) be able to have multiple copies of a model that all have the same schema, and since they share a schema, I should be able to LoadEntry from any of them into the same REntry. I may, however, be misunderstanding what the “schema ID” represents.
I am aware of RNTupleProcessor::CreateChain, which I would like to avoid for two main reasons:
- It is still experimental.
- My current Framework enables running the same processors with or without an input file to read by using a transient RNTupleModel to host the fields that are being processed. Being able to BindValue<void> the reader’s model addresses to the transient model’s addresses is what allows reading input RNTuples to plug seamlessly into the structure.
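
To make the second point concrete, here is a minimal, ROOT-free sketch of the address sharing I am after. MockReader, Bind, Load, and ReadAll are hypothetical stand-ins made up for illustration; they are not ROOT classes or functions and say nothing about how RNTupleReader works internally. The only point is that the processor-facing std::shared_ptr is created once, and each per-file reader is pointed at that same address through a type-erased binding.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a per-file reader; NOT a ROOT class.
class MockReader {
 public:
  explicit MockReader(std::vector<int> values) : values_(std::move(values)) {}

  // Point the named field at an externally owned, type-erased address.
  void Bind(const std::string& name, std::shared_ptr<void> address) {
    bound_[name] = std::move(address);
  }

  std::size_t size() const { return values_.size(); }

  // Copy entry i of this "file" into the address bound to the field "value".
  void Load(std::size_t i) {
    *std::static_pointer_cast<int>(bound_.at("value")) = values_.at(i);
  }

 private:
  std::vector<int> values_;
  std::map<std::string, std::shared_ptr<void>> bound_;
};

// Read every "file" in sequence through one shared address and return
// what a processor holding that address would observe, entry by entry.
std::vector<int> ReadAll(const std::vector<std::vector<int>>& files) {
  auto value = std::make_shared<int>(0);  // created once, kept by processors
  std::vector<int> seen;
  for (const auto& file : files) {
    MockReader reader(file);      // a new reader per file...
    reader.Bind("value", value);  // ...bound to the same address each time
    for (std::size_t i = 0; i < reader.size(); ++i) {
      reader.Load(i);
      seen.push_back(*value);
    }
  }
  return seen;
}
```

In this sketch the processors never need a callback when the file changes, because the address they hold is the one every new reader writes into.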