Better design for my RDataSource?

Day · January 31, 2023, 2:14pm

There is an excellent RDataSource subclass that supports reading xAOD files, and I’m trying to add friend trees to it. The way I’m doing it right now seems to create a lot of code duplication, so I was hoping you could offer me some advice on doing this more elegantly.

What I have right now

On top of the existing RDataSource:

On initialisation, I get a list of all branches in the friend trees, and their types.
When each slot is initialised (InitSlot( unsigned int slot, ULong64_t firstEntry )), I create a second TChain containing the xAOD and its friends, specifically for reading data from the friend trees. For each branch in the friend tree that is not a container, I use SetBranchAddress to get a pointer to the element, stored as a void *.
When GetColumnReaders is called, it’s the pointers created in InitSlot that are returned.
When each entry is set (SetEntry( unsigned int slot, ULong64_t entry )), for the non-containers it’s enough to call GetEntry on the friend chain for the slot. For the container branches (RVec) it’s going to be ugly: it seems I will need to copy the content of the container each time a new event is set, as in root/RTreeColumnReader.hxx at b1b6d269215eeeff07f99e4e94623c49fd920243 · root-project/root · GitHub. My main concern is that this copy seems unavoidable, even for branches that are not used.

Can you see a neater alternative?
Any way to limit the columns I need to read from in GetEntry?

Edit: I now realise that unless a column has been requested through GetColumnReaders, there is no need to do anything about it. So maybe the simplest way is to call SetBranchAddress and/or create placeholder RVec pointers in GetColumnReaders, and keep track there of which columns are in use, so that only container branches that are actually in use get copied in SetEntry?
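
For illustration, the bookkeeping described in the edit above could look roughly like the sketch below. It is deliberately ROOT-free so that it compiles on its own: std::vector stands in both for the buffer that GetEntry fills and for the RVec handed to RDataFrame, and the names ContainerColumn, FriendColumnTracker, Request, FillBuffer and CopyActive are hypothetical, not part of ROOT or of the xAOD datasource. A real datasource would presumably keep one such tracker per slot.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one container branch of a friend tree.
struct ContainerColumn {
    std::vector<float> branchBuffer; // stand-in for what GetEntry fills
    std::vector<float> exposed;      // stand-in for the RVec given to RDataFrame
    bool inUse = false;              // set once the column has been requested
    std::size_t nCopies = 0;         // bookkeeping for this example only
};

// Hypothetical per-slot helper: remembers which columns were requested.
class FriendColumnTracker {
public:
    void AddColumn(const std::string& name) { fColumns.emplace(name, ContainerColumn{}); }

    // Would be called from GetColumnReaders: marks the column as in use and
    // returns a stable pointer to the vector that will hold its values.
    std::vector<float>* Request(const std::string& name) {
        ContainerColumn& col = fColumns.at(name);
        col.inUse = true;
        return &col.exposed;
    }

    // Stand-in for the friend chain's GetEntry filling the branch buffer.
    void FillBuffer(const std::string& name, std::vector<float> values) {
        fColumns.at(name).branchBuffer = std::move(values);
    }

    // Would be called from SetEntry: copies only the columns that are in use.
    void CopyActive() {
        for (auto& kv : fColumns) {
            ContainerColumn& col = kv.second;
            if (!col.inUse) continue; // unused columns cost nothing here
            col.exposed = col.branchBuffer;
            ++col.nCopies;
        }
    }

    std::size_t Copies(const std::string& name) const { return fColumns.at(name).nCopies; }

private:
    // Node-based map: pointers to elements stay valid when more columns are added.
    std::unordered_map<std::string, ContainerColumn> fColumns;
};

// Small usage example: only the requested column is copied.
inline void TrackerExample() {
    FriendColumnTracker tracker;
    tracker.AddColumn("used");
    tracker.AddColumn("unused");
    std::vector<float>* values = tracker.Request("used");
    tracker.FillBuffer("used", {1.f, 2.f});
    tracker.FillBuffer("unused", {3.f});
    tracker.CopyActive();
    assert(values->size() == 2);
    assert(tracker.Copies("unused") == 0);
}
```

This only avoids the copy for unused columns. Limiting what GetEntry itself reads is a separate matter; TTree::SetBranchStatus is one existing way to disable branches that are not needed.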

Axel · February 6, 2023, 2:32pm

Thanks for your question, @Day, and apologies for the slow reaction. Certainly @eguiraud can help here, but this kind of question is probably better discussed on Mattermost, as Enrico already proposed in your other question.

eguiraud · February 7, 2023, 4:33pm

I guess this is because the xAOD datasource uses the old GetColumnReaders API, the one that returns a vector<T **>, so you have to load all values for all columns that might be needed ahead of time.

If that’s the case you might want to switch to the new GetColumnReaders API, the one that returns a unique_ptr<RColumnReaderBase>. You can then put some logic in your RXAODColumnReader (a type that has to inherit from RColumnReaderBase) that copies the RVec contents on demand.
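
A rough, ROOT-free sketch of the "copy on demand" idea follows. ColumnReaderBase is only a stand-in, written on the assumption that RColumnReaderBase exposes a public Get<T>(entry) that forwards to a private virtual GetImpl(entry); please check the header of the ROOT version in use. LazyVecReader and LazyReaderExample are hypothetical names, and std::vector stands in for both the branch contents and RVec.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for ROOT's RColumnReaderBase (assumed shape, not the real class).
class ColumnReaderBase {
public:
    virtual ~ColumnReaderBase() = default;

    // Returns a typed pointer to the value of this column for `entry`.
    template <typename T>
    T* Get(long long entry) { return static_cast<T*>(GetImpl(entry)); }

private:
    virtual void* GetImpl(long long entry) = 0;
};

// Hypothetical reader that copies the container only when a value is
// actually requested, and at most once per entry.
class LazyVecReader final : public ColumnReaderBase {
public:
    // `source` stands in for the friend-tree branch: one vector per entry.
    explicit LazyVecReader(const std::vector<std::vector<float>>* source) : fSource(source) {}

    std::size_t Copies() const { return fCopies; }

private:
    void* GetImpl(long long entry) override {
        if (entry != fCachedEntry) {
            // First request for this entry: do the copy now.
            fValue = (*fSource)[static_cast<std::size_t>(entry)];
            fCachedEntry = entry;
            ++fCopies;
        }
        return &fValue;
    }

    const std::vector<std::vector<float>>* fSource;
    std::vector<float> fValue;   // stand-in for the RVec handed to the user
    long long fCachedEntry = -1; // no entry loaded yet
    std::size_t fCopies = 0;     // bookkeeping for this example only
};

// Small usage example: nothing is copied until Get is called.
inline void LazyReaderExample() {
    const std::vector<std::vector<float>> data{{1.f, 2.f}, {3.f}};
    LazyVecReader reader(&data);
    assert(reader.Copies() == 0);
    assert(reader.Get<std::vector<float>>(0)->size() == 2);
    reader.Get<std::vector<float>>(0); // same entry: served from the cache
    assert(reader.Copies() == 1);
}
```

With a reader of this kind, a column that is never read in a given event costs no copy at all, which is the difference from filling every value ahead of time in SetEntry.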

I hope this helps. Feel free to use the ~PPP channel on ROOT’s Mattermost for more development discussion.

Cheers,
Enrico

Day · February 10, 2023, 12:47pm

Thanks, that’s a very useful thing to know about.

system · February 24, 2023, 12:47pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.