How is CheckBranchAddressType supposed to work when the type is a vector?

I’m having some issues creating pointers to branches that contain vector data. The problem occurred inside a custom RDataSource, but I can reproduce it without any RDataFrame involvement, so it comes down to my understanding of SetBranchAddress and CheckBranchAddressType.

For example, say my TTree has a branch called “nJets” and a branch called “JetIsGood”. The “JetIsGood” branch contains a ROOT::VecOps::RVec<Bool_t> whose length is given by “nJets”. I’m trying to read “JetIsGood”, but first I want to call CheckBranchAddressType (certain overloads of SetBranchAddress do this automatically). This might look like:

std::string colName = "JetIsGood";
TBranch *branch = mytree->GetBranch(colName.c_str());

std::string type_name = "ROOT::VecOps::RVec<Bool_t>";
auto ptrClass = TClass::GetClass( type_name.c_str() );

EDataType datatype = EDataType(0);
bool isptr = true;

mytree->CheckBranchAddressType(branch, ptrClass, datatype, isptr);

Inside that function, on line 2852, expectedType is set to plain old Bool_t. That is sort of reasonable, except that the branch holds a vector, not a single bool. I do have branches that are single bools, like “EventIsGood”, so I would expect those to get some kind of different treatment. Then we get down to line 3022, where it throws an error:

The pointer type given "ROOT::VecOps::RVec<bool>" does not correspond to the type needed "Bool_t" (18) by the branch: JetIsGood

I know that error is thrown a few times, but I checked, it’s certainly the version on line 3022. It seems like I’m misusing this somehow. But I really don’t expect this branch type to be a Bool_t, it should be some kind of vector for sure. Any pointers on what I’m doing wrong?

I guess @pcanal can help.

Can you show the result of:

mytree->GetBranch("JetIsGood")->Print();

?

Thanks for taking a look at this @pcanal

root [15] datatree->GetBranch("JetIsGood")->Print()
*Br    1 :JetIsGood :                                  *
*         | JetIsGood[nJets]/O       *
*Entries :    18831 : Total  Size=     271527 bytes  File Size  =      38607 *
*Baskets :       11 : Basket Size=      32000 bytes  Compression=   7.01     *
*............................................................................*

I also tried with floats, which seem to work about the same way. I just haven’t really figured out variable length branches.

RDataFrame uses an indirect method to connect an RVec to this type of branch (i.e. SetBranchAddress is indeed expected to reject this). If you need to probe whether the branch can be handled by RDataFrame, you will need to use GetExpectedType and then figure out that it is an array, e.g. by getting the underlying leaf and checking whether it has a non-null GetLeafCount(). That would tell you it is a collection (it could be an array or part of an STL collection); either way it should be loadable into an RVec.
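As a rough, ROOT-free illustration of that check: the Print() output earlier in the thread shows the leaf descriptor JetIsGood[nJets]/O, where the bracketed part names the leaf that holds the length and O is the leaf-list code for Bool_t. The sketch below parses such a descriptor to decide whether a leaf is a scalar or an array. LeafDescriptor, parseLeafDescriptor and exampleFromThread are hypothetical names invented for this example; with ROOT available one would query the leaf itself (GetLeafCount()) rather than parse text, and only single-dimension descriptors are handled here.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper type for illustration; not part of ROOT.
struct LeafDescriptor {
    std::string name;    // leaf name, e.g. "JetIsGood"
    std::string extent;  // text inside [...], e.g. "nJets"; empty for a scalar
    char typeCode = 0;   // leaf-list type code, e.g. 'O' for Bool_t

    // True when the leaf holds several values per entry.
    bool isArray() const { return !extent.empty(); }

    // True when the extent names another leaf (variable length)
    // rather than a fixed number.
    bool hasCountLeaf() const {
        return isArray() &&
               !std::all_of(extent.begin(), extent.end(),
                            [](unsigned char c) { return std::isdigit(c) != 0; });
    }
};

// Parses "name/T" or "name[extent]/T" into `out`.
// Returns false for anything else, including multi-dimensional descriptors.
inline bool parseLeafDescriptor(const std::string& text, LeafDescriptor& out) {
    const std::string::size_type slash = text.rfind('/');
    if (slash == std::string::npos || slash + 2 != text.size()) return false;

    LeafDescriptor d;
    d.typeCode = text[slash + 1];
    const std::string head = text.substr(0, slash);

    const std::string::size_type open = head.find('[');
    if (open == std::string::npos) {
        d.name = head;  // scalar leaf
    } else {
        // Require exactly one "[...]" group, placed at the end of the head.
        if (head.back() != ']' || head.find(']') != head.size() - 1) return false;
        d.name = head.substr(0, open);
        d.extent = head.substr(open + 1, head.size() - open - 2);
        if (d.extent.empty()) return false;
    }
    if (d.name.empty()) return false;
    out = d;
    return true;
}

// Usage on the two kinds of branch mentioned in the thread.
inline void exampleFromThread() {
    LeafDescriptor jets;
    assert(parseLeafDescriptor("JetIsGood[nJets]/O", jets));
    assert(jets.hasCountLeaf());   // variable length: read as an array

    LeafDescriptor event;
    assert(parseLeafDescriptor("EventIsGood/O", event));
    assert(!event.isArray());      // single bool: a plain address would do
}
```

Both leaves report the same element type code, which matches the Bool_t that CheckBranchAddressType expects; the difference between them is only in the extent.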

Ok, thanks, that’s really helpful information. Essentially, I need to replicate that indirect method (loading the vector into an RVec). I’d hoped that understanding why CheckBranchAddressType failed would tell me what I’m getting wrong, but it sounds like what I’m doing now is a dead end.

Unless I’m misreading it, RRootDS doesn’t actually handle this. It would fail with the same error I got in the first post if it tried to read a vector column. But I don’t think that class is in active use anymore.

Does the indirect method that RDataFrame uses to read vector branches start here: root/RTreeColumnReader.hxx at b1b6d269215eeeff07f99e4e94623c49fd920243 · root-project/root · GitHub?

Hello @Day,

you are correct that RRootDS is unused (see e.g. its docs) and the code you linked to is exactly the one that implements the “magic” to expose arrays as RVecs in RDataFrame.

Basically RDF reads array-like branches with TTreeReaderArray and then constructs RVec objects that act as views over the memory buffer handled by TTreeReaderArray.
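The idea of a view over a reader-managed buffer can be sketched without ROOT. In the minimal model below, ArrayReader is a hypothetical stand-in for the role TTreeReaderArray plays (it owns the per-entry storage), and BufferView is a hypothetical non-owning view standing in for the RVec that RDF builds on top of it. None of these names are ROOT API, and the real implementation in RTreeColumnReader.hxx may differ in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a reader that owns the array data of the
// current entry. float is used because std::vector<bool> does not expose
// a contiguous buffer.
class ArrayReader {
public:
    // Simulates moving to a new entry: the storage is replaced.
    void loadEntry(std::vector<float> values) { buffer_ = std::move(values); }
    const float* data() const { return buffer_.data(); }
    std::size_t size() const { return buffer_.size(); }

private:
    std::vector<float> buffer_;
};

// Non-owning view: stores only a pointer and a length, copies nothing.
template <typename T>
class BufferView {
public:
    BufferView(const T* data, std::size_t size) : data_(data), size_(size) {}

    std::size_t size() const { return size_; }
    const T& operator[](std::size_t i) const {
        assert(i < size_);  // debug-only bounds check
        return data_[i];
    }
    const T* begin() const { return data_; }
    const T* end() const { return data_ + size_; }

private:
    const T* data_;
    std::size_t size_;
};

// Builds a view over whatever the reader currently holds. The view is
// only valid until the reader loads another entry.
inline BufferView<float> makeView(const ArrayReader& reader) {
    return BufferView<float>(reader.data(), reader.size());
}
```

In this model the view has to be rebuilt after every loadEntry call, because the reader may reallocate its storage. A custom data source that follows the same pattern would need to refresh its views once per entry in the same way.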

Out of curiosity, why do you need to replicate this behavior in a custom data-source?

Cheers,
Enrico

Many thanks :)

So the project is to allow the reading of an xAOD file and a friend tree via RDataFrame.
Reading an xAOD in RDataFrame is handled by Attila’s subclass of RDataSource. I want to extend that class to also permit a friend tree, and I’ve made a first pass at it here.

Unfortunately, my code errors if the friend tree contains a variable length branch. It throws the error described in the first post. I’m hopeful that by looking at the way RDataSource would do that trick I can fix my issue (and probably just by getting my head round some more basic examples of reading vectors from trees).

I see – feel free to hop on ROOT’s mattermost team to discuss development in case you have any more questions.

Cheers,
Enrico

That sounds like a very useful discussion to follow/place to ask questions. How can I join that?
Thanks, H

I’ll send you a private link just to avoid posting the link where web crawlers see it, but it should be easy to find for other people that end up reading this conversation :)
