Can I use RDataFrame to interact with objects I’ve put in a branch of a TTree?
Suppose I have a class (Event) deriving from TObject, and a branch (“eventObj”) in the tree holding an instance of this class. How can I, for example, do a lambda function filter using this branch? I want something like:
auto cut1 = [](const Event* ev) { ... do stuff with ev; };
df.Filter(cut1,"eventObj").....
But when I try this I get ‘CreateProxy’ errors about the dictionary not existing, even though I know the class works fine for things like tree->Scan("eventObj->blah()").
Can I use RDataFrame to interact with objects I’ve put in a branch of a TTree?
Yes! For non-trivial classes, ROOT dictionaries must be available so that ROOT knows how to correctly read and write the class. If tree->Scan does not complain, dictionaries must be there somewhere, and you have to load them into your compiled code.
A small self-contained example of RDataFrame reading/writing a custom type:
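// Event.h
#ifndef EVENT_H
#define EVENT_H
#include <TObject.h>

// Minimal custom type that can be stored in a TTree branch
class Event : public TObject {
   int x = 42;

public:
   Event() {}
   int GetX() const { return x; }
   // Class version must be >= 1 for I/O: a version of 0 marks the class as not needing I/O
   ClassDef(Event, 1);
};
#endif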
Running .L Event.h+ at the ROOT prompt compiles the header with ACLiC and creates the dictionaries. The interpreter looks for dictionaries in the current directory, so it finds them later when executing main().
My point is: RDataFrame works fine with custom objects, but dictionaries need to be available as per the error message. If TTree::Scan does not error out, that makes me think the dictionaries are there, somewhere, and you have to have the application pick them up.
@moneta or @pcanal can comment with more authority, but I don’t think TTreeFormula supports computations with arbitrary types (such as void*).
About using RDataFrame, the problem is that RDF really does not have good support for nullptr values. If you remove the second entry, the one with the nullptr, this works fine:
d.Foreach([](RooFitResult& b) { std::cout << b.GetName() << std::endl; }, {"fr"});
If you had another column that indicated presence/absence of a RooFitResult, you could work around this limitation of RDF by prepending a Filter([](bool isNull) { return !isNull; }, {"is_roofitresult_null"}) to the Foreach, so that the Foreach only processes events with non-null RooFitResults.
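The filter-before-action idea can be sketched without ROOT. This is a minimal plain C++ illustration, not RDataFrame code: FitResult, Row and namesOfValidResults are hypothetical stand-ins for RooFitResult, one tree entry and the Filter + Foreach chain. The point it shows is only that the flag is checked first, so the code that dereferences the object never runs for null entries.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for RooFitResult (only a name, for illustration).
struct FitResult {
   std::string name;
};

// Hypothetical stand-in for one tree entry with two columns.
struct Row {
   bool isNull;         // plays the role of the "is_roofitresult_null" column
   const FitResult *fr; // plays the role of the "fr" column; may be nullptr
};

// Collect the names of all non-null results. The flag is tested before the
// pointer is dereferenced, mirroring a Filter placed ahead of a Foreach.
std::vector<std::string> namesOfValidResults(const std::vector<Row> &rows)
{
   const auto keep = [](bool isNull) { return !isNull; }; // same predicate as the Filter
   std::vector<std::string> names;
   for (const Row &row : rows) {
      if (!keep(row.isNull))
         continue; // filtered out: the "action" below never sees this row
      assert(row.fr != nullptr); // the flag and the pointer are expected to agree
      names.push_back(row.fr->name);
   }
   return names;
}
```

In RDataFrame terms, this corresponds to chaining the Filter on "is_roofitresult_null" directly in front of the Foreach on "fr".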
Ah thanks I think that might have been how I was getting stuck. I think that’s this problem solved.
I’m also assuming that all of this will be considered thread safe (i.e. that concurrent calls to my lambda will get references to different objects here).
I’m trying to learn what I can do with RDataFrame, coming from extensive advanced use of TTree::Draw/Scan trickery a few years ago. I feel like at some point I was able to call methods on objects as part of a Draw/Scan, but it’s been a while and I seem to have forgotten how. But at least RDataFrame can use these lambdas OK.
Will probably be back soon with more questions… thanks
Your lambdas must be safe to call concurrently when multi-threading is activated – but indeed, they will definitely get inputs corresponding to different events when called concurrently.
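To make "safe to call concurrently" concrete, here is a minimal ROOT-free sketch in plain C++. The names forEachSlot and sumEvents are hypothetical, and the event loop is a simplified assumption, not how RDataFrame is implemented. It borrows the idea behind RDataFrame's ForeachSlot: each worker passes its slot index to the callback, and the callback writes only to the accumulator belonging to that slot, so no locking is needed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical stand-in for a multi-threaded event loop: every worker ("slot")
// processes a disjoint range of events and passes its slot index to the callback.
template <typename F>
void forEachSlot(const std::vector<int> &events, unsigned nSlots, F callback)
{
   assert(nSlots > 0);
   const std::size_t chunk = (events.size() + nSlots - 1) / nSlots;
   std::vector<std::thread> workers;
   for (unsigned slot = 0; slot < nSlots; ++slot) {
      workers.emplace_back([&events, &callback, slot, chunk] {
         const std::size_t begin = std::min(events.size(), slot * chunk);
         const std::size_t end = std::min(events.size(), begin + chunk);
         for (std::size_t i = begin; i < end; ++i)
            callback(slot, events[i]); // different threads see different events
      });
   }
   for (auto &w : workers)
      w.join();
}

// Thread-safe accumulation: one partial sum per slot, merged after the loop.
long sumEvents(const std::vector<int> &events, unsigned nSlots)
{
   std::vector<long> partial(nSlots, 0L);
   forEachSlot(events, nSlots, [&partial](unsigned slot, int x) {
      partial[slot] += x; // each slot touches only its own element
   });
   return std::accumulate(partial.begin(), partial.end(), 0L);
}
```

A callback that instead added to a single shared counter would be a data race, even though each call receives a different event. Getting distinct inputs does not by itself make shared state safe.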
Please do! Actually, this is too good of a thread for it to stay in the Newbie section and get deleted in a couple of weeks. Would you be ok with promoting it to the ROOT section?
That is correct. The arguments to functions called by TTree::Draw can only be of simple numeric types. (But you can call member functions of complex objects as long as they take no arguments or only simple numerical arguments.)