I recently discovered (the hard way) that the zipping methods provided by the ROOT framework are limited to data sizes below 16MB. I'm talking about the methods R__zip and R__unzip, which are declared in root/RZip.h at master · root-project/root · GitHub, with their practical implementations in e.g. root/ZipZSTD.cxx at master · root-project/root · GitHub or root/ZipLZMA.c at master · root-project/root · GitHub.
There are two main issues here. The first was a missing check, in the ZSTD case, that the maximum size is respected; that one is handled via Lack of size validation in ZSTD compression · Issue #9334 · root-project/root · GitHub.
I would like to discuss the other issue here: why is there such a limitation, and can we remove it? The size is passed to these methods as an int, so in principle we could handle up to 2GB (or 4GB if it were unsigned). The limitation comes instead from the format of the generated byte stream: the header is 9 bytes, namely 3 magic bytes (ZS\1) and 2 sizes (original and zipped) on 3 bytes each, and a 3-byte field cannot hold a value above 2^24 - 1 = 16,777,215.
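To make the 3-byte limit concrete, here is a minimal sketch of how such a 9-byte header could be packed. The helper names (put24, get24, makeHeaderV1), the little-endian byte order and the order of the two size fields are my assumptions for illustration only; the authoritative layout is whatever the ROOT sources linked above actually do.

```cpp
#include <array>
#include <cstdint>

// Largest value a 3-byte (24-bit) size field can hold: just under 16 MiB.
constexpr std::uint32_t kMaxSize24 = 0xFFFFFFu;

// Hypothetical helper: store the low 24 bits of v at p, least significant byte first.
inline void put24(unsigned char* p, std::uint32_t v) {
    p[0] = static_cast<unsigned char>(v & 0xFF);
    p[1] = static_cast<unsigned char>((v >> 8) & 0xFF);
    p[2] = static_cast<unsigned char>((v >> 16) & 0xFF);
}

// Hypothetical helper: read back a 24-bit little-endian value.
inline std::uint32_t get24(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16);
}

// Hypothetical helper: fill a 9-byte "ZS\1"-style header.
// Returns false when either size does not fit in 3 bytes, which is
// exactly the case that must be rejected before compressing.
inline bool makeHeaderV1(std::uint32_t zipSize, std::uint32_t origSize,
                         std::array<unsigned char, 9>& out) {
    if (zipSize > kMaxSize24 || origSize > kMaxSize24) {
        return false;
    }
    out[0] = 'Z';
    out[1] = 'S';
    out[2] = 1;
    put24(&out[3], zipSize);   // assumed field order: zipped size first
    put24(&out[6], origSize);  // then original size
    return true;
}
```

With this layout, a buffer of exactly 16,777,215 bytes still fits, while one of 16,777,216 bytes (0x1000000) would silently lose its top byte if the check were skipped.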
What about introducing a new magic sequence (ZS\2) and an 11-byte header with sizes on 4 bytes each? Does that sound feasible?
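As a rough sketch of what I have in mind, a reader could dispatch on the third magic byte and accept both layouts, so existing ZS\1 buffers would stay readable. Everything here is hypothetical: the names (HeaderInfo, writeHeaderV2, parseHeader), the little-endian byte order and the field order are assumptions, and nothing like this exists in ROOT today.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helpers: little-endian store/load of an n-byte unsigned field (n <= 4).
inline void storeLE(unsigned char* p, std::uint32_t v, int n) {
    for (int i = 0; i < n; ++i) {
        p[i] = static_cast<unsigned char>((v >> (8 * i)) & 0xFF);
    }
}

inline std::uint32_t loadLE(const unsigned char* p, int n) {
    std::uint32_t v = 0;
    for (int i = 0; i < n; ++i) {
        v |= static_cast<std::uint32_t>(p[i]) << (8 * i);
    }
    return v;
}

struct HeaderInfo {
    std::size_t headerLen;   // 9 for ZS\1, 11 for the proposed ZS\2
    std::uint32_t zipSize;   // size of the compressed payload
    std::uint32_t origSize;  // size of the original data
};

// Write the proposed 11-byte header: 3 magic bytes plus two 4-byte sizes.
inline void writeHeaderV2(unsigned char* h, std::uint32_t zipSize,
                          std::uint32_t origSize) {
    h[0] = 'Z';
    h[1] = 'S';
    h[2] = 2;
    storeLE(h + 3, zipSize, 4);
    storeLE(h + 7, origSize, 4);
}

// Parse either layout; returns false for an unknown magic sequence.
inline bool parseHeader(const unsigned char* h, HeaderInfo& out) {
    if (h[0] != 'Z' || h[1] != 'S') {
        return false;
    }
    int width = 0;  // bytes per size field
    if (h[2] == 1) {
        width = 3;
    } else if (h[2] == 2) {
        width = 4;
    } else {
        return false;
    }
    out.headerLen = 3 + 2 * static_cast<std::size_t>(width);
    out.zipSize = loadLE(h + 3, width);
    out.origSize = loadLE(h + 3 + width, width);
    return true;
}
```

The writer would presumably keep emitting ZS\1 for buffers that fit in 3 bytes and switch to ZS\2 only above that, so that files with small buffers remain readable by older versions.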