Exploring RNTuple network streaming via Apache Arrow Flight

Hi everyone,

I am working with the low-level internals of the new C++ API and am looking for some advice on memory management.

  • ROOT Version: 6.36.00
  • Platform: Ubuntu 24.04 (Docker)
  • Compiler: GCC / C++17

I have been building a C++ prototype to stream RNTuple columnar data natively as Apache Arrow tables over an Arrow Flight (gRPC) server. The idea is to stream data to Python/remote clients without requiring local file downloads.

The C++ streaming works reasonably well for a first version (I measured less than 1.85x overhead compared to a raw RNTuple loop), but I want to optimize the memory handoff.

My Question:
Currently, my overhead comes from doing a manual memcpy of the values from RNTuple into Arrow’s pre-allocated memory buffers.

Does the RNTupleReader API expose a safe way to get a raw pointer to the underlying uncompressed memory page for a specific column?

I would prefer to “alias” or “borrow” that memory directly into Apache Arrow to achieve true zero-copy, but I am not sure if that memory is strictly hidden behind the REntry / model layer.

Any tips, or a pointer to the right class in the source code, would be hugely appreciated. Thank you!

Reference:
GitHub - KaranSinghDev/RNTuple-Arrow-Gateway

Hello Karan,

Welcome to the ROOT Forum!

I am adding @jblomer to the loop.

Best,
Danilo

Thank you, let me know if any additional information is required. I am still working on this, so I will update the thread if I make any decent progress.

That is very interesting, thank you for reaching out to us!

For individual elements of simple types, the RNTupleDirectAccessView gives a reference into the page buffer.

In general, however, there is no public API to access the page buffers directly. Even with such an API, memory copies would be necessary where the target buffer does not align with the RNTuple page boundaries. The page boundaries are also not aligned with entry boundaries, so when reading more than one column, the number of elements that could possibly be accessed without an additional copy would probably be small.
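
To make the page-boundary argument concrete, here is a minimal sketch in plain standard C++. It does not use any ROOT or Arrow API: `Page`, `ContiguousRun`, and `CommonRun` are hypothetical names invented for illustration, and the model assumes simple columns with exactly one element per entry and sorted, gap-free pages. It computes how many entries, starting at a given entry, are contiguous in memory in every column at once, which is an upper bound on what could be borrowed without a copy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for one uncompressed page of a column (not a ROOT type).
struct Page {
    std::size_t firstElement;  // global index of the first element in the page
    std::size_t nElements;     // number of elements stored in the page
};

// Number of elements, starting at `first`, that lie contiguously in one page.
// Returns 0 if `first` is not covered by any page.
std::size_t ContiguousRun(const std::vector<Page>& pages, std::size_t first)
{
    for (const Page& p : pages) {
        const std::size_t pageEnd = p.firstElement + p.nElements;
        if (first >= p.firstElement && first < pageEnd)
            return pageEnd - first;
    }
    return 0;
}

// Longest run of entries starting at `first` that is contiguous in every
// column simultaneously: the minimum of the per-column runs.
std::size_t CommonRun(const std::vector<std::vector<Page>>& columns, std::size_t first)
{
    assert(!columns.empty());
    std::size_t run = std::numeric_limits<std::size_t>::max();
    for (const auto& pages : columns)
        run = std::min(run, ContiguousRun(pages, first));
    return run;
}
```

In this toy model, if one column has pages of 1000 elements and another has pages of 300 elements, `CommonRun` at entry 0 is 300, and at entry 950 it drops to 50 because the first column's page ends at entry 1000. Every additional column can only shrink the run.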

Bulk reading may help (RNTupleReader::CreateBulk()). That involves a memory copy from the page buffer but it copies in bulk and not value by value. You can bind an existing buffer into which you bulk read, which can be the one provided by Arrow.
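
The difference between copying in bulk and copying value by value can be sketched as follows. This is plain standard C++ and deliberately not the ROOT bulk API: `FloatPage` and `BulkCopy` are hypothetical names, and the destination pointer stands in for a buffer that Arrow would have pre-allocated. The sketch issues one `memcpy` per touched page instead of one per value and writes directly into the caller-owned buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an uncompressed page of a float column (not a ROOT type).
struct FloatPage {
    std::size_t firstElement;   // global index of the first value in the page
    std::vector<float> values;  // uncompressed page content
};

// Copy up to `count` values starting at global index `first` into `dst`,
// which the caller owns (for example, memory pre-allocated for an Arrow array).
// Pages are assumed sorted and gap-free. One memcpy is issued per touched page.
// Returns the number of values actually copied.
std::size_t BulkCopy(const std::vector<FloatPage>& pages, std::size_t first,
                     std::size_t count, float* dst)
{
    assert(dst != nullptr);
    std::size_t copied = 0;
    for (const FloatPage& page : pages) {
        if (copied == count)
            break;
        const std::size_t pageEnd = page.firstElement + page.values.size();
        const std::size_t pos = first + copied;  // next global index to copy
        if (pos < page.firstElement || pos >= pageEnd)
            continue;  // this page does not contain the next value
        const std::size_t n = std::min(count - copied, pageEnd - pos);
        std::memcpy(dst + copied,
                    page.values.data() + (pos - page.firstElement),
                    n * sizeof(float));
        copied += n;
    }
    return copied;
}
```

A request that spans two pages is served with two `memcpy` calls regardless of how many values it contains, so the per-value overhead disappears even though the data is still copied once.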