Reading only part of an array from a hard drive

LeWhoo · January 17, 2023, 3:47pm

Hello,

I wonder if ROOT has any functionality for partially reading a multidimensional array from a file. Let’s say we have a 4D array and need only a small part of it, so we request a slice array[x1:x2][y1:y2] and only this slice is read into memory. This is possible in Python with .npy files via the mmap_mode argument of numpy.load, and it can speed up input significantly for very large arrays stored on a hard drive. I could not find anything similar in ROOT, but since ROOT often leads in I/O performance, I decided to ask.
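For reference, the NumPy approach mentioned above can be sketched roughly as follows. This is a minimal sketch, not a benchmark: the file name, array shape, and slice bounds are made up for illustration, and a small array is written first only so that the example is self-contained.

```python
import os
import tempfile

import numpy as np

# Hypothetical file name and array shape, chosen only for illustration.
path = os.path.join(tempfile.mkdtemp(), "big_array.npy")
full = np.arange(6 * 5 * 4 * 3, dtype=np.float64).reshape(6, 5, 4, 3)
np.save(path, full)

# mmap_mode="r" memory-maps the file read-only instead of loading it;
# the array data stays on disk until it is actually accessed.
mapped = np.load(path, mmap_mode="r")

# Slicing the memmap and copying the result materializes only the
# requested block in memory.
x1, x2, y1, y2 = 1, 3, 2, 4
block = np.array(mapped[x1:x2, y1:y2])

print(block.shape)
```

Note that in actual NumPy syntax the two-axis slice is written array[x1:x2, y1:y2]; chaining array[x1:x2][y1:y2] would slice the first axis twice.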

jalopezg · January 18, 2023, 12:31pm

Hi @LeWhoo,

I think that ROOT I/O does not allow that out of the box, but I might well be wrong (@pcanal might correct me).
I think the desired behavior should (somehow) be possible to implement in a custom streamer, but again, I think @pcanal can give better guidance here.

Cheers,
J.

LeWhoo · January 18, 2023, 1:07pm

Thanks. There is probably not much need for it in the ROOT community then. I am asking out of curiosity: the numpy solution works fine for me, and I don’t think I would have enough time or knowledge to implement my own streamer for that.

pcanal · January 18, 2023, 5:23pm

I guess that to achieve this you would need to prepare the input at write time for the kinds of slicing you expect, by splitting the array into distinct branches and/or entries. To a large extent, the RNTuple on-disk format would be more friendly to this kind of access.
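The general idea of preparing the layout at write time can be illustrated independently of ROOT. The sketch below uses plain NumPy files as a stand-in: each index along the first axis is stored as its own file, playing the role of one entry, so a reader only opens the entries it needs. It does not use any ROOT, TTree, or RNTuple API, and the helper names write_chunks and read_slice are hypothetical.

```python
import os
import tempfile

import numpy as np


def write_chunks(array, directory):
    """Store each index along axis 0 as its own file (a stand-in for one entry)."""
    for i in range(array.shape[0]):
        np.save(os.path.join(directory, f"entry_{i}.npy"), array[i])


def read_slice(directory, x1, x2, y1, y2):
    """Load only entries x1..x2-1 and keep indices y1..y2-1 of each one."""
    parts = []
    for i in range(x1, x2):
        entry = np.load(os.path.join(directory, f"entry_{i}.npy"))
        parts.append(entry[y1:y2])
    return np.stack(parts)


# Illustrative data: a small 4D array split along its first axis.
directory = tempfile.mkdtemp()
data = np.arange(6 * 5 * 4 * 3).reshape(6, 5, 4, 3)
write_chunks(data, directory)

# Only entry_1.npy and entry_2.npy are opened for this request.
subset = read_slice(directory, 1, 3, 2, 4)
print(subset.shape)
```

In this layout each selected entry is still read in full before the second axis is cut, so the split chosen at write time has to match the slicing pattern expected later.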
