I am trying to define a Python function, define a new column on my RDF using this function, and save it via Snapshot, but at the last stage, in the Snapshot call, the new column is not recognized. A simplified version of the code is below.
Thanks for the post. We acknowledge your question. While the experts look at the script (@vpadulan) can you confirm you see the same with any 6.30 release?
Thanks for reaching out to the forum and sorry for the late reply. I took a quick look at your code and noticed that you pass the third argument of the Snapshot call with curly brackets instead of square brackets. Before proceeding further with debugging, could you try passing a Python list like ["elemvaid"]? Let me know if this helps.
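To illustrate the difference between the two bracket styles: in C++ a column list is usually written with braces, but in Python the same braces create a set rather than a list. A minimal sketch in plain Python, using only the column name from the post:

```python
# In Python, braces around strings build a set, not a list.
cols_braces = {"elemvaid"}
cols_list = ["elemvaid"]

print(type(cols_braces).__name__)  # set
print(type(cols_list).__name__)    # list

# A list keeps insertion order and supports indexing; a set does neither.
print(cols_list[0])  # elemvaid
```

Passing the list form to Snapshot avoids relying on how a Python set gets converted on the C++ side.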
Ok thanks. I actually started reading your snippet bottom-up, and I didn’t notice the upper part of it. You are creating a Python function, then you are trying to use it inside the body of a C++ function that you declare to the cling interpreter. Apart from the fact that the Python function and the C++ function have the same name, so at best you will run into an infinite recursion problem, I would like to step back to the general approach you are trying. I have not yet seen an example of simply “calling” a Python function in C++ code declared to cling. Did you do this because you followed some tutorial, or some piece of documentation? What are you trying to achieve there?
Thank you very much for your reply. So sorry for my late reply as I oddly missed your response.
The idea of this script is to define a function (which has to be written in Python because it uses an external package), then use this function to define a new column in the ROOT dataframe, and save the column via Snapshot.
With the part within this declaration
ROOT.gInterpreter.Declare("""
""")
I tried to declare this function, which, as you said, is apparently not implemented correctly. I hope it is clearer now. Please let me know if it needs more clarification. Do you have any idea how it should be implemented?
Calling a pure Python function from within the RDataFrame event loop is just not possible (at least at the moment). There are a few ways that you could tackle this challenge in general. One way is to use ROOT.Numba.Declare to JIT your Python function via Numba, if that is possible. See the example in the ROOT tutorial tutorials/pyroot/pyroot004_NumbaDeclare.py.
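As a rough sketch of that approach (not the original poster's code): the function add_offset, the column names x and y, and the output file name below are all hypothetical. The ROOT part only runs if ROOT and Numba are installed, and the function has to stay within what Numba can compile.

```python
def add_offset(x):
    # Hypothetical stand-in for the real computation; it must be
    # simple enough for Numba to compile.
    return x * x + 1.0

try:
    import ROOT
except ImportError:
    ROOT = None  # lets the sketch be imported on machines without ROOT

def run_with_root(out_file="numba_sketch.root"):
    # Register the function with cling; it becomes Numba::add_offset in C++.
    ROOT.Numba.Declare(["double"], "double")(add_offset)
    df = ROOT.RDataFrame(5).Define("x", "(double)rdfentry_")
    df = df.Define("y", "Numba::add_offset(x)")
    # Column names are passed as a Python list.
    df.Snapshot("tree", out_file, ["x", "y"])

if __name__ == "__main__" and ROOT is not None:
    run_with_root()
```

ROOT.Numba.Declare is applied here as a plain call rather than with the @ decorator syntax, so add_offset can still be checked as ordinary Python before it is handed to ROOT.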
Thanks a lot.
The first option sounds very good, but it seems to have a problem when one loops over the elements of the argument. I have attached a simple script with 2 functions: the first function works, but the second one, which is currently commented out, produces errors from the JIT step, e.g.:
raise Exception('Failed to jit Python callable {} with numba.jit'.format(func))
Do you have any idea how to solve this jit problem?
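Since the attached script is not shown here, the following is only a sketch of a loop over an array argument written in a form that Numba can usually compile: a plain for loop, a numeric accumulator, and a single scalar return type. The name sum_of_squares, the column names, and the output file are hypothetical. As far as I understand, ROOT.Numba.Declare compiles in Numba's nopython mode, so Python objects or packages that Numba does not support inside the function body would produce the kind of error quoted above.

```python
def sum_of_squares(values):
    # Explicit loop with a float accumulator; always returns a float.
    total = 0.0
    for v in values:
        total += v * v
    return total

try:
    import ROOT
except ImportError:
    ROOT = None  # lets the sketch be imported on machines without ROOT

def run_with_root(out_file="rvec_loop_sketch.root"):
    # An RVec<double> column reaches the Python function as a 1D array.
    ROOT.Numba.Declare(["RVec<double>"], "double")(sum_of_squares)
    df = ROOT.RDataFrame(3).Define(
        "vec", "ROOT::RVec<double>{1.0, 2.0, (double)rdfentry_}")
    df = df.Define("sumsq", "Numba::sum_of_squares(vec)")
    df.Snapshot("tree", out_file, ["sumsq"])

if __name__ == "__main__" and ROOT is not None:
    run_with_root()
```

Testing the function in plain Python first, and then under numba.njit directly, tends to give a more detailed error message than the generic exception raised by ROOT.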