Draw short integers as numbers, not characters

Hello,

I was trying to store integers whose range I know to be limited as int8_t or uint8_t in TTrees (using the b or B branch type qualifiers). I assumed compression would be fairly efficient at taking care of this, but I still noticed that it resulted in a file size reduction.

Unfortunately, this completely messes up the drawing of the branches in a TBrowser (both the new web-based and the legacy one):

image

instead of:

image

This is not the end of the world, but it’s still quite annoying for quick checks of the file content…

Is there a way to specify that I want those to be drawn as integers and not characters?

Thanks,
Sébastien

ROOT Version: 6.26/00
Platform: all
Compiler: any


Hi @swertz,

Thanks for reporting this here. I guess it also happens if you manually create and fill a TH1?
@linev is this a known bug? Otherwise, I’ll open a new issue on GitHub.

Cheers,
J.

No, this works fine:

root [2] auto test = TH1F("test", "test", 20, -1, 19);
root [3] Events->Draw("Electron_jetIdx>>test")

Hi,

Do you get expected results when calling:

Events->Draw("Electron_jetIdx")

Apologies, I had the wrong file opened…

Both

root [2] auto test = TH1F("test", "test", 20, -1, 19);
root [3] Events->Draw("Electron_jetIdx>>test")

and

Events->Draw("Electron_jetIdx")

give the same (bad) result.

I guess this is a question for @pcanal: how should TTree::Draw() be invoked to get the expected results with int8_t data? It seems to treat the data as const char *.

Probably related: ROOT 5 created TTree read with ROOT6 looks broken for int8_t · Issue #7565 · root-project/root · GitHub

In this case I was writing using ROOT 6.24/07 and reading with 6.26/00.

It seems that the issue cited by @yus is not the same problem.
That said, I was unable to reproduce the problem you initially reported in either ROOT master or v6-26-00-patches, i.e. 6.26/11.

@swertz Could you please share a minimal reproducer?
@pcanal Does this ring a bell or anything?

Cheers,
J.

I’ve tried a few things in 6.26/00…:

This works fine with both int8_t and uint8_t:

void test1() {
    auto f = TFile::Open("test1.root", "recreate");
    TTree t("Events", "");
    int8_t var;
    t.Branch("var", &var, "var/B");
    for (size_t i=0; i<10; i++) {
        var = i;
        t.Fill();
    }
    f->Write();
    f->Close();
}

However, the following, which is how I originally encountered the issue, only works with uint8_t. Using int8_t results in the weird formatting:

void test() {
    auto f = TFile::Open("test2.root", "recreate");
    TTree t("Events", "");

    //typedef uint8_t T;  // this works fine
    typedef int8_t T;  // this doesn't

    std::string typeQual;
    if constexpr (std::is_same<T, uint8_t>())
        typeQual = "b";
    else if constexpr (std::is_same<T, int8_t>())
        typeQual = "B";
    else if constexpr (std::is_same<T, int>())
        typeQual = "I";

    unsigned int counter;
    std::vector<T> Jet_idx;

    t.Branch("nJet", &counter, "nJet/i");
    auto br = t.Branch("Jet_idx", (void*)nullptr, ("Jet_idx[nJet]/" + typeQual).c_str());

    for (size_t i=0; i<10; i++) {
        Jet_idx = std::vector<T>(i, i);
        br->SetAddress(Jet_idx.data());  // data() is valid even for the empty vector of the first entry
        counter = Jet_idx.size();
        t.Fill();
    }

    f->Write();
    f->Close();
}

I’m attaching the resulting file: test2.root (5.6 KB)

And to read the second tree:

void read() {
    auto f = TFile::Open("test2.root");
    TTreeReader reader("Events", f);
    TTreeReaderArray<int8_t> array(reader, "Jet_idx");
    while (reader.Next()) {
        cout << "Event " << reader.GetCurrentEntry() <<  endl;
        auto size = array.GetSize();
        cout << "Size: " << size << " - data: ";
        for (size_t i=0 ; i < size; i++) {
            auto val = array[i];
            cout << "raw: " << val << ", cast: " << static_cast<int>(val) << " -- ";
        }
        cout << endl;
    }
}

Which gives:

Error in <TTreeReaderArrayBase::CreateContentProxy()>: The branch Jet_idx contains data of type char. It cannot be accessed by a TTreeReaderArray<signed char>
Event 0
Size: 0 - data: 
Event 1
Size: 1 - data: raw: , cast: 1 -- 
Event 2
Size: 2 - data: raw: , cast: 2 -- raw: , cast: 2 --
...

The error is strange! This happens when I create the TTree using int8_t and /B!

When both writing and reading with uint8_t and /b I get no error:

Event 0
Size: 0 - data: 
Event 1
Size: 1 - data: raw: , cast: 1 -- 
Event 2
Size: 2 - data: raw: , cast: 2 -- raw: , cast: 2 --
...
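
A side note on the empty `raw:` fields in both outputs: they look like a formatting effect rather than a reading problem. On common platforms int8_t and uint8_t are aliases for signed char and unsigned char, so streaming them prints a character, and the values 1 and 2 are unprintable control characters. A minimal sketch in plain C++ (no ROOT needed; the helper names `asCharacter` and `asNumber` are made up for illustration):

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Streams the value as-is: with int8_t being a character type on common
// platforms, the stream writes a single character, not digits.
std::string asCharacter(std::int8_t value) {
    std::ostringstream out;
    out << value;
    return out.str();
}

// Converts to int first, so the stream writes the decimal number.
std::string asNumber(std::int8_t value) {
    std::ostringstream out;
    out << static_cast<int>(value);
    return out.str();
}
```

This matches the `cast:` column above, which shows the expected numbers once the value is converted to int.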

Hello, any update on this?

I guess @pcanal may help.

For historical reasons, /B is handled as a C-style string (e.g. const char *). For example, tweaking the code above with:

        Jet_idx = std::vector<T>(i, 96+i);

you get the output:

root [1] Events->Scan("","","")
************************************
*    Row   * nJet.nJet * Jet_idx.J *
************************************
*        0 *         0 *           *
*        1 *         1 *         a *
*        2 *         2 *        bb *
*        3 *         3 *       ccc *
*        4 *         4 *      dddd *
*        5 *         5 *     eeeee *
*        6 *         6 *    ffffff *
*        7 *         7 *   ggggggg *
*        8 *         8 *  hhhhhhhh *
*        9 *         9 * iiiiiiiii *
************************************
(long long) 10

(This explains the histogram in the first post)
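
To illustrate the string interpretation outside ROOT, here is a plain C++ sketch that reads the same bytes once as text and once as integers. It is not ROOT's actual implementation, and `asString` and `asNumbers` are hypothetical helper names:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Interprets each 8-bit value as a character code, the way a char array
// is shown when it is treated as text.
std::string asString(const std::vector<std::int8_t>& bytes) {
    std::string text;
    for (std::int8_t b : bytes) {
        text.push_back(static_cast<char>(b));
    }
    return text;
}

// Widens each 8-bit value to int so it is shown as a number.
std::vector<int> asNumbers(const std::vector<std::int8_t>& bytes) {
    return std::vector<int>(bytes.begin(), bytes.end());
}
```

For an entry holding three elements equal to 96+3, `asString` gives ccc (as in row 3 of the Scan output above), while `asNumbers` gives 99, 99, 99.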

On the other hand, I can reproduce the problem with TTreeReaderArray which should have worked.

@pcanal Historical description → TTree

            - C : a character string terminated by the 0 character
            - B : an 8 bit signed integer (Char_t)
            - b : an 8 bit unsigned integer (UChar_t)

Yes, this is accurate. In addition, TTree::Draw and TTree::Scan treat an array of /B as a string.

Thanks @pcanal. Is there any way to get Draw or Scan to treat them as integers instead? It seems to me that this use case would be far more common than using /B to store strings…

Yes. You can involve the value in a spurious arithmetic operation (e.g. +0):

root [1] Events->Scan("Jet_idx+0","","")
***********************************
*    Row   * Instance * Jet_idx+0 *
***********************************
*        0 *        0 *           *
*        1 *        0 *        97 *
*        2 *        0 *        98 *
*        2 *        1 *       110 *
*        3 *        0 *        99 *
*        3 *        1 *       110 *
*        3 *        2 *        99 *
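
If several branches need this treatment, the expression can be built programmatically. A small sketch in plain C++; the helper name `numericExpr` is hypothetical, and only the +0 trick itself comes from the answer above:

```cpp
#include <string>

// Appends a no-op arithmetic operation to a branch name so that the
// resulting expression is evaluated numerically instead of as a string.
std::string numericExpr(const std::string& branchName) {
    return branchName + "+0";
}
```

In a ROOT session this could be used as `Events->Draw(numericExpr("Jet_idx").c_str())`, which passes the same expression as typing `Events->Draw("Jet_idx+0")`.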

The TTreeReader error message is spurious, and the data still seem to be read correctly, don’t they? The spurious error will be removed shortly; see Spurrious error message when reading a `char` from a `TTreeReader<signed char>` · Issue #11837 · root-project/root · GitHub to follow the resolution.

Thanks, that’s a useful trick.

However, this will not work when clicking on a branch while inspecting a file in a TBrowser… Why can’t int8_t be interpreted as a number by default? I don’t see why anyone in HEP would want to make a histogram of characters, whereas storing small integers as int8_t seems like a common use case.