Are non-thread-safe RDataFrame operations allowed when custom classes store pointers?

Dear experts,

I have a question about a feature of RDataFrame which I am currently using for some prototyping code.

I have an ntuple that stores some “raw” double values of objects as vectors.

I have declared a functor whose operator() constructs objects from those values and creates a column:

std::map<int, Object> operator()( ... ){
     std::map<int, Object> myObjects;
     ... fill myObjects
     return myObjects;
}

The column is then created with a simple:

node = node.Define( "myMap", functor, {...});

Now the tricky bit is that I have multiple object types:

node = node.Define("myMap2", functor2, {...});

and myMap2 stores a std::map<int, Object2>.

The problematic part is that, only once both maps have been created, I want to “update” the objects in one map with a pointer to an object in the other container
(e.g. call Object1.setRelationProperty( &map2[key] ), which updates a private member of Object1 with a pointer to an element of map2).

With multi-threading switched off all of this works, but I need a function that does the linking:

auto link = []( std::map<int, Object1> & objs1, std::map<int, Object2> & objs2 ){
       ... here I set the linking relation
      return 1;
};
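A minimal sketch of what such a linking step could look like, using only the standard library. Object1 and Object2 here are simplified, hypothetical stand-ins (the real classes are not shown in this post), and linkByPointer and copyPointsIntoOriginal are illustrative names, not part of RDataFrame. The second helper shows one property worth keeping in mind: a pointer set this way refers to the original map2, so a copy of map1 still points into the original container rather than into a copy of map2.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-ins for the real Object1 / Object2 classes.
struct Object2 {
    double value;
};

class Object1 {
public:
    void setRelationProperty(const Object2* other) { related_ = other; }
    const Object2* relation() const { return related_; }

private:
    const Object2* related_ = nullptr;  // non-owning pointer into another container
};

// Link every Object1 to the Object2 stored under the same key, if there is one.
// Returns the number of links that were set.
int linkByPointer(std::map<int, Object1>& objs1, std::map<int, Object2>& objs2) {
    int nLinked = 0;
    for (auto& entry : objs1) {
        auto it = objs2.find(entry.first);
        if (it != objs2.end()) {
            entry.second.setRelationProperty(&it->second);
            ++nLinked;
        }
    }
    return nLinked;
}

// Returns true if, after copying both maps, the copied Object1 still points
// into the ORIGINAL map2 and not into the copy of map2.
bool copyPointsIntoOriginal() {
    std::map<int, Object1> map1{{1, Object1{}}};
    std::map<int, Object2> map2{{1, Object2{3.5}}};
    const int nLinked = linkByPointer(map1, map2);
    assert(nLinked == 1);
    (void)nLinked;  // silence unused-variable warnings when NDEBUG is set

    auto map1Copy = map1;  // copies the raw pointer value, not the pointee
    auto map2Copy = map2;  // elements of the copy live at new addresses
    return map1Copy.at(1).relation() == &map2.at(1) &&
           map1Copy.at(1).relation() != &map2Copy.at(1);
}
```

If the original map2 is later overwritten or destroyed while a copy of map1 is still in use, the stored pointers no longer refer to valid objects. Whether that is what happens in this particular workflow is an assumption, not something verified here.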

To ensure that later node operations use the updated containers, I need the column defined by “link” to be used as an input to the next operations I want to run. Otherwise the RDataFrame scheduler would “forget” about this link-update step, since nothing else depends on its result.

Everything seems to work for the moment, but when I perform a “Take” operation, the code sometimes works and sometimes does not, and I am a bit puzzled.

Any suggestion or help is more than welcome.

Best,
Renato

PS: Apologies if my question is naive, but all I am trying to do is:

  • “Create/Define an object container1 of Obj1”
  • “Create/Define an object container2 of Obj2”
  • “Update container1/2 in a single Define (without copying, so that Obj1 has access to Obj2 and can modify it)”

Then do a Take of the containers, with all properties linked, for some local debugging and development of what to write as the next “functor”.

In old ROOT one would have done this with a plain loop over events, performing the operations one by one. Here I realized that RDF allows updating objects created by a previous operation, since it does not complain when operator()( myContainer & cont1 ) takes its argument by non-const reference. But then, to use the updated container, you need to pass a “token” column from that operation as an input column, so that the event-loop scheduler accounts for the order in which the Defines must run.

Hi Renato,

I understand that the program works when everything is single-threaded and stops working in MT mode: is this correct? If yes, are you sure all the operations performed are thread safe?

Best,
D

Hi @Danilo, I am not trying to make it work in MT mode; single-threaded works, and I suspect the issue was the memory layout used for my objects. In general, I guess that rather than storing pointers into other containers, I had better store indices into the other container and use them to retrieve the linked object.
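A minimal sketch of that index-based alternative, using only the standard library. Object1, Object2, setRelationKey, relation and linkSurvivesCopy are hypothetical names chosen for illustration; the real classes may look quite different. The idea is that Object1 stores the key of the related Object2 and resolves it against whichever container it is handed, so the link stays meaningful after the containers are copied.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-ins for the real Object1 / Object2 classes.
struct Object2 {
    double value;
};

class Object1 {
public:
    // Remember WHICH Object2 is related (its map key), not WHERE it lives.
    void setRelationKey(int key) {
        relatedKey_ = key;
        hasRelation_ = true;
    }

    // Resolve the link against a given container; nullptr if there is no link
    // or the key is not present in that container.
    const Object2* relation(const std::map<int, Object2>& objs2) const {
        if (!hasRelation_) return nullptr;
        auto it = objs2.find(relatedKey_);
        return it == objs2.end() ? nullptr : &it->second;
    }

private:
    int relatedKey_ = 0;
    bool hasRelation_ = false;
};

// Returns true if the link can still be resolved from copies of both maps
// after the original map2 has been emptied.
bool linkSurvivesCopy() {
    std::map<int, Object1> map1{{1, Object1{}}};
    std::map<int, Object2> map2{{1, Object2{3.5}}};
    map1.at(1).setRelationKey(1);

    auto map1Copy = map1;
    auto map2Copy = map2;
    map2.clear();  // the original container goes away

    const Object2* linked = map1Copy.at(1).relation(map2Copy);
    return linked != nullptr && linked == &map2Copy.at(1) && linked->value == 3.5;
}
```

The cost of this design is one map lookup per access, and the caller has to supply the matching container each time the link is followed.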

Best