Storing TTree values as 24-bit integers

I am using ROOT 5.34 with C++.

My experimental DAQ has 24 bits of precision, and currently stores each value on the hard disk as a signed 32-bit integer in which the 8 least significant bits are used for error checking.

For calculations, an 8-bit right shift, int a = (stored_value>>8), is used to remove those error-checking bits.
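A minimal sketch of that unpacking, assuming the layout described above (upper 24 bits hold the signed sample, lowest 8 bits hold the check byte). The function names are hypothetical, and the right shift of a negative value is assumed to be arithmetic, which holds on common compilers and is guaranteed from C++20:

```cpp
#include <cassert>
#include <cstdint>

// Upper 24 bits: the signed DAQ sample.
// Assumes >> on a negative int32_t is an arithmetic (sign-preserving) shift.
inline std::int32_t unpackSample(std::int32_t stored_value) {
    return stored_value >> 8;
}

// Lowest 8 bits: the error-checking byte.
inline std::uint8_t unpackCheckBits(std::int32_t stored_value) {
    return static_cast<std::uint8_t>(stored_value & 0xFF);
}
```

For example, a stored word of 0x00123456 would unpack to a sample of 0x1234 and a check byte of 0x56.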

For the first level of processed data, it would take a third less space on disk if I could store 24-bit integers in my TTrees.

Is this possible to set up?

I do not think this is possible. The built-in integer types of C++ come in widths that are powers of 2 (8, 16, 32 and 64 bits on common platforms), and I suppose this is true for ROOT as well, although I tend not to use the ROOT built-in types.

Why not split each 24-bit value into two parts, storing the 8 least significant bits as an 8-bit integer and the upper 16 bits as a 16-bit integer? Granted, this will still give you some overhead since you need to store two branches instead of one, but it might still reduce the size of your trees on disk.
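A rough sketch of that split, assuming the sample is already a signed 24-bit value held in a 32-bit integer. The struct and function names are hypothetical; in a TTree the two members would presumably map to a Short_t branch and a UChar_t branch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical two-part layout for one 24-bit sample.
struct SplitSample {
    std::int16_t high;  // upper 16 bits, carries the sign
    std::uint8_t low;   // lower 8 bits
};

// Split a signed 24-bit sample. Assumes >> on a negative value is an
// arithmetic shift (true on common compilers, guaranteed from C++20).
inline SplitSample splitSample(std::int32_t sample24) {
    SplitSample s;
    s.high = static_cast<std::int16_t>(sample24 >> 8);
    s.low  = static_cast<std::uint8_t>(sample24 & 0xFF);
    return s;
}

// Recombine the two parts; multiplication avoids left-shifting a negative value.
inline std::int32_t joinSample(const SplitSample& s) {
    return static_cast<std::int32_t>(s.high) * 256 + s.low;
}
```

Splitting and rejoining should return the original value for any sample in the signed 24-bit range, -8388608 to 8388607.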

ROOT trees can be compressed (see TFile::SetCompressionSettings). So my advice would be to not worry about this at all. You don’t save 1/3 of disk space on a compressed tree, especially when 8 bits are always set to zero.

Here is a comparison of the compression levels on a subset of my data:

level | file size (MB) | processing time
0 | 417 | ~10 seconds (no compression)
1 | 163 | ~20 seconds
2 | 163 | ~20 seconds
5 | 160 | noticeably but not significantly longer than level 2
9 | 129 | 10-15 minutes (maximum compression)

So it looks like the default compression level of 1 is nearly as good as going all the way to 9, tolerably slower than 0, and knocks 60% off the file size.

So the built-in compression is working well and, from what I have read, splitting things over more branches and leaves may decrease the achievable compression.

Of note, the start of the TTree->Print() output is:
******************************************************************************
*Tree :Asym-r38314: Asym-r38314 *
*Entries : 12498 : Total = 436924908 bytes File Size = 170386008 *
*        :       : Tree compression factor = 1.00 *
******************************************************************************

"Total = 436924908 bytes" is the uncompressed data size.
"File Size = 170386008" is, as it says, the size on disk.
"Tree compression factor = 1.00" is unclear to me, and it led me to think the TTree was not being compressed.

This was done in ROOT 5.34, so the compression factor may be more meaningful in later versions.
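For comparison, the ratio implied by the two byte counts in the printout can be worked out directly. This is only a sketch with hypothetical helper names, not how ROOT computes its reported factor:

```cpp
#include <cassert>

// Ratio of uncompressed size to on-disk size; 1.0 would mean no compression.
inline double compressionFactor(double totalBytes, double fileBytes) {
    return totalBytes / fileBytes;
}

// Fraction of the uncompressed size that is saved on disk.
inline double spaceSaved(double totalBytes, double fileBytes) {
    return 1.0 - fileBytes / totalBytes;
}
```

With Total = 436924908 and File Size = 170386008, compressionFactor gives about 2.56 and spaceSaved gives about 0.61, in line with the roughly 60% reduction seen in the table.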

Which compression algorithm did you test? Most of the time I am using level 3 with kLZMA, i.e. setting 203. For my datasets kLZMA compresses much better than kZLIB, so you might want to try it as well.

My apologies for the delay in responding.

I cannot find any information on how to change the compression algorithm.

What command do you use to change the compression algorithm?

a) to set the algo only: file->SetCompressionAlgorithm(algo);
b) to set both, algorithm and level: file->SetCompressionSettings(100*algo+level);

See https://root.cern.ch/doc/master/classTFile.html#aadd8e58e4d010c80b728bc909ac86760 for details.
Also see the enum ECompressionAlgorithm in Compression.h

I have not had time to experiment with the compression setting, and have to move on due to time constraints.

From ECompressionAlgorithm in Compression.h, there look to be four options:

enum ECompressionAlgorithm {
    kUseGlobalSetting,
    kZLIB,
    kLZMA,
    kOldCompressionAlgo,
    // if adding new algorithm types,
    // keep this enum value last
    kUndefinedCompressionAlgorithm
};

If I am reading it all correctly, calling file->SetCompressionSettings(100*algo+level) as file->SetCompressionSettings(203) selects entry number 2 of the enum, kLZMA, and sets the compression level to 3.
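That reading can be checked with a small sketch of the 100*algo+level encoding. The enum here is a local copy of the one quoted above, for illustration only, and the helper names are hypothetical; real code would include Compression.h from ROOT instead:

```cpp
#include <cassert>

// Local copy of the enum quoted in the thread (values 0..4 in order).
enum ECompressionAlgorithm {
    kUseGlobalSetting,
    kZLIB,
    kLZMA,
    kOldCompressionAlgo,
    kUndefinedCompressionAlgorithm
};

// Encode algorithm and level as 100*algo + level.
inline int makeCompressionSettings(ECompressionAlgorithm algo, int level) {
    return 100 * static_cast<int>(algo) + level;
}

// Decode the two parts again.
inline int algorithmOf(int settings) { return settings / 100; }
inline int levelOf(int settings) { return settings % 100; }
```

Here makeCompressionSettings(kLZMA, 3) yields 203, the value that would be passed to file->SetCompressionSettings, and decoding 203 gives algorithm 2 (kLZMA) and level 3.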

Thank you for your help.
