Using Python class members with Numba Declare reading RVec<double> columns

Dear experts,
I am currently trying to use RDataFrame to perform some operations that I previously had to do with Awkward Arrays, and I would like RDataFrame to use a custom Python class I coded.

Basically I have a TTree which contains (x,y,z) locations of hits in a tracker in the form of

RVec<double> x, … columns

I have coded a masking shape using shapely, which tells me what to flag depending on the x-y-z location.

Now what I would like to achieve is the following:

acceptance = createAcceptanceFunction()  # this creates a custom Python class which internally makes use of the shapely library

df.Define( "flag" , "acceptance.GetFlag( x,y,z)")

I was wondering how Numba Declare could deal with RVec column inputs and return RVec outputs.

Any suggestions?

Thanks in advance
Renato

Hi @RENATO_QUAGLIANI,

thanks for reaching out!
I’m not quite clear on what you want to do. Do you want to use Numba with your createAcceptanceFunction class, which involves RVec objects and the shapely library? Could you please post a small code snippet that reproduces your situation?

Cheers,
Monica


Hi @mdessole ,

This is the pseudocode I want to use, but it crashes badly in the JIT compilation step:

#TestSciFiNumba
import ROOT
import numpy as np
from geometry.FTAcceptanceXY import FTAcceptanceXY
from kernel import MCHit

### you can mimic my case with these simple stand-ins for the two imported classes

class MCHit:
    def __init__(self, x, y, z, time, key):
        self.X = x
        self.Y = y
        self.Z = z

class FTAcceptanceXY:
    def __init__(self, geometryfile):
        pass  # do nothing
    def IsPixelLayer(self, hit: MCHit):
        return True

MT = FTAcceptanceXY("geometry/MightyTracker.geom")

@ROOT.Numba.Declare(['RVecD', 'RVecD'], 'RVecB')
def IsPixelLayer(x, y, z):
    toreturn = []
    for _x, _y, _z in zip(x, y, z):
        toreturn.append(MT.IsPixelLayer(MCHit(_x, _y, _z, 0, 0)))
    return np.array(toreturn)

Basically I want to do a df.Define( "pixel" , "IsPixelLayer( x,y,z)") where x, y, z are RVec<double> of the same length. My “pixel” branch has to be another RVec of booleans of the same length, but the call has to use my own Python classes in the event loop.

Thanks @RENATO_QUAGLIANI.
I have a few questions for you:

  • here I see two declarations of IsPixelLayer, one as the function you’re trying to jit and one as a method of the class FTAcceptanceXY, is it correct?
  • should the signature of your decorator be ['RVecD', 'RVecD', 'RVecD'], 'RVecB'?
  • does your code work without jitting?
  • can you post also the error you get?

Hi @mdessole , this is what I get with a cleaned-up version of the snippet.

#TestSciFiNumba
import ROOT
import numpy as np

##some classes in python
# from geometry.FTAcceptanceXY import FTAcceptanceXY [ custom ones imported in case]
# from kernel import MCHit [ custom ones imported in case]
class MCHit : 
    def __init__(self, x,y,z, time, key):
        self.X = x
        self.Y = y
        self.Z = z 
##some classes in python        
class FTAcceptanceXY: 
    def __init__(self, geometryfile : str) : 
        self.OK = True
        #do nothing
    def IsPixelLayer(self, hit : MCHit) : 
        return True
MT = FTAcceptanceXY("geometry/MightyTracker.geom")

@ROOT.Numba.Declare(['RVecD', 'RVecD', 'RVecD'], 'RVecB')
def IsPixelLayer(x, y, z):
    toreturn = ROOT.RVecB()
    for _x, _y, _z in zip( x,y,z):
        toreturn.push_back( MT.IsPixelLayer( MCHit( _x, _y, _z, 0,0)))
    return toreturn

This is the message I get:

---------------------------------------------------------------------------
TypingError                               Traceback (most recent call last)
File ~/root_src/build/lib/ROOT/_numbadeclare.py:150, in _NumbaDeclareDecorator.<locals>.inner(func, input_types, return_type, name)
    149 if nb_return_type is not None:
--> 150     nbjit = nb.jit(nb_return_type(*nb_input_types), nopython=True, inline='always')(func)
    151 else:

File /usr/local/lib/python3.11/site-packages/numba/core/decorators.py:241, in _jit.<locals>.wrapper(func)
    240 for sig in sigs:
--> 241     disp.compile(sig)
    242 disp.disable_compile()

File /usr/local/lib/python3.11/site-packages/numba/core/dispatcher.py:965, in Dispatcher.compile(self, sig)
    964 try:
--> 965     cres = self._compiler.compile(args, return_type)
    966 except errors.ForceLiteralArg as e:

File /usr/local/lib/python3.11/site-packages/numba/core/dispatcher.py:129, in _FunctionCompiler.compile(self, args, return_type)
    128 else:
--> 129     raise retval

File /usr/local/lib/python3.11/site-packages/numba/core/dispatcher.py:139, in _FunctionCompiler._compile_cached(self, args, return_type)
    138 try:
--> 139     retval = self._compile_core(args, return_type)
    140 except errors.TypingError as e:

File /usr/local/lib/python3.11/site-packages/numba/core/dispatcher.py:152, in _FunctionCompiler._compile_core(self, args, return_type)
    151 impl = self._get_implementation(args, {})
--> 152 cres = compiler.compile_extra(self.targetdescr.typing_context,
    153                               self.targetdescr.target_context,
    154                               impl,
    155                               args=args, return_type=return_type,
    156                               flags=flags, locals=self.locals,
    157                               pipeline_class=self.pipeline_class)
    158 # Check typing error if object mode is used

File /usr/local/lib/python3.11/site-packages/numba/core/compiler.py:770, in compile_extra(typingctx, targetctx, func, args, return_type, flags, locals, library, pipeline_class)
    768 pipeline = pipeline_class(typingctx, targetctx, library,
    769                           args, return_type, flags, locals)
--> 770 return pipeline.compile_extra(func)

File /usr/local/lib/python3.11/site-packages/numba/core/compiler.py:461, in CompilerBase.compile_extra(self, func)
    460 self.state.lifted_from = None
--> 461 return self._compile_bytecode()

File /usr/local/lib/python3.11/site-packages/numba/core/compiler.py:529, in CompilerBase._compile_bytecode(self)
    528 assert self.state.func_ir is None
--> 529 return self._compile_core()

File /usr/local/lib/python3.11/site-packages/numba/core/compiler.py:508, in CompilerBase._compile_core(self)
    507         if is_final_pipeline:
--> 508             raise e
    509 else:

File /usr/local/lib/python3.11/site-packages/numba/core/compiler.py:495, in CompilerBase._compile_core(self)
    494 try:
--> 495     pm.run(self.state)
    496     if self.state.cr is not None:

File /usr/local/lib/python3.11/site-packages/numba/core/compiler_machinery.py:368, in PassManager.run(self, state)
    367 patched_exception = self._patch_error(msg, e)
--> 368 raise patched_exception

File /usr/local/lib/python3.11/site-packages/numba/core/compiler_machinery.py:356, in PassManager.run(self, state)
    355 if isinstance(pass_inst, CompilerPass):
--> 356     self._runPass(idx, pass_inst, state)
    357 else:

File /usr/local/lib/python3.11/site-packages/numba/core/compiler_lock.py:35, in _CompilerLock.__call__.<locals>._acquire_compile_lock(*args, **kwargs)
     34 with self:
---> 35     return func(*args, **kwargs)

File /usr/local/lib/python3.11/site-packages/numba/core/compiler_machinery.py:311, in PassManager._runPass(self, index, pss, internal_state)
    310 with SimpleTimer() as pass_time:
--> 311     mutated |= check(pss.run_pass, internal_state)
    312 with SimpleTimer() as finalize_time:

File /usr/local/lib/python3.11/site-packages/numba/core/compiler_machinery.py:273, in PassManager._runPass.<locals>.check(func, compiler_state)
    272 def check(func, compiler_state):
--> 273     mangled = func(compiler_state)
    274     if mangled not in (True, False):

File /usr/local/lib/python3.11/site-packages/numba/core/typed_passes.py:110, in BaseTypeInference.run_pass(self, state)
    107 with fallback_context(state, 'Function "%s" failed type inference'
    108                       % (state.func_id.func_name,)):
    109     # Type inference
--> 110     typemap, return_type, calltypes, errs = type_inference_stage(
    111         state.typingctx,
    112         state.targetctx,
    113         state.func_ir,
    114         state.args,
    115         state.return_type,
    116         state.locals,
    117         raise_errors=self._raise_errors)
    118     state.typemap = typemap

File /usr/local/lib/python3.11/site-packages/numba/core/typed_passes.py:89, in type_inference_stage(typingctx, targetctx, interp, args, return_type, locals, raise_errors)
     87     infer.seed_type(k, v)
---> 89 infer.build_constraint()
     90 # return errors in case of partial typing

File /usr/local/lib/python3.11/site-packages/numba/core/typeinfer.py:1039, in TypeInferer.build_constraint(self)
   1038 for inst in blk.body:
-> 1039     self.constrain_statement(inst)

File /usr/local/lib/python3.11/site-packages/numba/core/typeinfer.py:1386, in TypeInferer.constrain_statement(self, inst)
   1385 if isinstance(inst, ir.Assign):
-> 1386     self.typeof_assign(inst)
   1387 elif isinstance(inst, ir.SetItem):

File /usr/local/lib/python3.11/site-packages/numba/core/typeinfer.py:1461, in TypeInferer.typeof_assign(self, inst)
   1460 elif isinstance(value, (ir.Global, ir.FreeVar)):
-> 1461     self.typeof_global(inst, inst.target, value)
   1462 elif isinstance(value, ir.Arg):

File /usr/local/lib/python3.11/site-packages/numba/core/typeinfer.py:1561, in TypeInferer.typeof_global(self, inst, target, gvar)
   1560 try:
-> 1561     typ = self.resolve_value_type(inst, gvar.value)
   1562 except TypingError as e:

File /usr/local/lib/python3.11/site-packages/numba/core/typeinfer.py:1482, in TypeInferer.resolve_value_type(self, inst, val)
   1481     msg = str(e)
-> 1482 raise TypingError(msg, loc=inst.loc)

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Untyped global name 'MT': Cannot determine Numba type of <class '__main__.FTAcceptanceXY'>

File "../../../../../../var/folders/lf/0b5lm39s3q5_bdp_6gjg7ynh0000gq/T/ipykernel_12471/4139421318.py", line 26:
<source missing, REPL/exec in use?>


During handling of the above exception, another exception occurred:

Exception                                 Traceback (most recent call last)
Cell In[27], line 22
     19         return True
     20 MT = FTAcceptanceXY("geometry/MightyTracker.geom")
---> 22 @ROOT.Numba.Declare(['RVecD', 'RVecD', 'RVecD'], 'RVecB')
     23 def IsPixelLayer(x, y, z):
     24     toreturn = ROOT.RVecB()
     25     for _x, _y, _z in zip( x,y,z):

File ~/root_src/build/lib/ROOT/_numbadeclare.py:155, in _NumbaDeclareDecorator.<locals>.inner(func, input_types, return_type, name)
    153         nb_return_type = nbjit.nopython_signatures[-1].return_type
    154 except:
--> 155     raise Exception('Failed to jit Python callable {} with numba.jit'.format(func))
    156 func.numba_func = nbjit
    157 # return_type = "int"

Exception: Failed to jit Python callable <function IsPixelLayer at 0x1a6c54e00> with numba.jit

It seems to me that the external MT object I create, which is called inside the Numba-declared function, is not picked up.

Any news on this?

Basically I think the issue might be that a Python class method cannot be used in Numba. Is that the case?

I believe, at least from my special use case, that interoperating powerful Python tools and objects with the RDataFrame event loop might be a very interesting feature, if available…

In my use case I have different x-y surfaces that declare the acceptance of a detector, which are ‘easily’ generated with shapely. Then I have a set of x,y hits in the detector stored in a tuple.

What I want to do is then to visualize the resulting occupancy of the detector.

If it were possible to achieve this in ROOT it would be very good. The alternative I have is converting to Awkward Arrays, looping over each single hit and then creating new containers… this is quite slow, and I was looking for a solution which can act directly in the RDataFrame event loop.

Hi Renato, sorry for the late answer.

You’re indeed right: in order to be able to jit a function, its body must not call external Python functions or classes that Numba cannot compile. For example, the following code causes an error

def pypow(e,y):
    return e**y

@ROOT.Numba.Declare(['RVec<float>', 'float'], 'float') 
def pypowsum(x, y):
    s = 0
    for e in x:
        s += pypow(e,y)
    return s

while the following equivalent version doesn’t.

@ROOT.Numba.Declare(['RVec<float>', 'float'], 'float') 
def pypowsum(x, y):
    s = 0
    for e in x:
        s += e**y 
    return s

So what you might want to try is to explicitly place the code into the function you want to jit.
Thank you for your suggestions, the users’ feedback is very valuable for us!

Cheers,
Monica
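
A minimal sketch of that advice applied to the acceptance case, assuming the shape can be reduced to a simple rectangular window. The window limits and the name is_in_window are hypothetical and not from the thread; the function uses only loops, arithmetic and NumPy arrays, so it needs no Python object at run time. The registration step is wrapped in a helper because it needs ROOT and numba to be installed.

```python
import numpy as np

# Hypothetical rectangular acceptance window (illustrative numbers only).
X_MIN, X_MAX = -100.0, 100.0
Y_MIN, Y_MAX = -50.0, 50.0


def is_in_window(x, y, z):
    """Flag each hit that lies inside the x-y window.

    The geometry test is written inline with plain comparisons, and the only
    globals are floats, so no custom Python class is needed inside the body.
    z is accepted to mirror the three-column call but is not used here.
    """
    flags = np.zeros(len(x), dtype=np.bool_)
    for i in range(len(x)):
        in_x = x[i] >= X_MIN and x[i] <= X_MAX
        in_y = y[i] >= Y_MIN and y[i] <= Y_MAX
        flags[i] = in_x and in_y
    return flags


def register_with_root():
    """Register is_in_window with ROOT.Numba.Declare (requires ROOT and numba)."""
    import ROOT
    return ROOT.Numba.Declare(['RVecD', 'RVecD', 'RVecD'], 'RVecB')(is_in_window)


# Plain NumPy check of the logic, independent of ROOT.
example_flags = is_in_window(np.array([0.0, 150.0, -20.0]),
                             np.array([0.0, 0.0, 60.0]),
                             np.zeros(3))
```

After calling register_with_root(), the function should be usable in a Define string as Numba::is_in_window(x, y, z), since Numba.Declare makes callables available in the Numba C++ namespace.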

Hi @mdessole , thanks a lot. I went a bit further and ended up looking into what Numba can and cannot do.

In principle Python classes can be jitted with @jitclass, see "Compiling Python classes with @jitclass" in the Numba 0.52.0.dev0 documentation.
However, the methods of the classes basically have to avoid using self. Though, it would be good if there were a way to use Numba classes and decorators directly, so that the event loop could use a custom Python class object.

I guess for my use case I had better see whether I can save my geometry to GeoJSON and load it with C++ libraries that can read the same files. So maybe it would just be a matter of saving the geometries from shapely in a format that I can load back in C++, and then making the C++ call directly inside a Python function.
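
A rough sketch of the export step, using only the standard library. The outline vertices, the file name and the helper ring_to_geojson are hypothetical; with shapely, shapely.geometry.mapping(polygon) returns a similar GeoJSON-like dictionary without the manual step. The C++ loading side is not shown.

```python
import json
import os
import tempfile


def ring_to_geojson(vertices):
    """Build a GeoJSON Polygon dict from a list of (x, y) vertices.

    GeoJSON expects a closed ring, so the first vertex is repeated at the
    end if it is not already there.
    """
    ring = [[float(x), float(y)] for x, y in vertices]
    if ring[0] != ring[-1]:
        ring.append(list(ring[0]))
    return {"type": "Polygon", "coordinates": [ring]}


# Hypothetical acceptance outline (a 2 x 2 square).
outline = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
geometry = ring_to_geojson(outline)

# Write the geometry to a temporary file and read it back to check the round trip.
path = os.path.join(tempfile.mkdtemp(), "acceptance.geojson")
with open(path, "w") as f:
    json.dump(geometry, f)

with open(path) as f:
    reloaded = json.load(f)
```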

In general I see quite some advantages in being able to perform the operator() call directly on objects created in Python or on custom-defined ones.
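
Another option that stays in Python is to export only the polygon vertices as plain arrays and do the containment test with loops and arithmetic, which is the kind of code Numba can usually compile. The sketch below uses ray casting, which is a swapped-in technique and not what shapely does internally. The vertex arrays and the name points_in_polygon are hypothetical; with shapely the vertices could come from polygon.exterior.coords.

```python
import numpy as np

# Hypothetical polygon vertices (a 2 x 2 square). With shapely these could be
# taken from np.asarray(polygon.exterior.coords); that is an assumption here.
POLY_X = np.array([0.0, 2.0, 2.0, 0.0])
POLY_Y = np.array([0.0, 0.0, 2.0, 2.0])


def points_in_polygon(x, y, poly_x, poly_y):
    """Ray-casting containment test for each point (x[i], y[i]).

    Counts how many polygon edges a horizontal ray starting at the point
    crosses; an odd count means the point is inside. Points lying exactly
    on an edge are not treated specially.
    """
    n_vertices = len(poly_x)
    inside = np.zeros(len(x), dtype=np.bool_)
    for i in range(len(x)):
        crossings = 0
        j = n_vertices - 1  # previous vertex, starting with the closing edge
        for k in range(n_vertices):
            # Does the edge (j -> k) straddle the horizontal line through y[i]?
            if (poly_y[k] > y[i]) != (poly_y[j] > y[i]):
                # x coordinate where the edge meets that horizontal line
                slope = (poly_x[j] - poly_x[k]) / (poly_y[j] - poly_y[k])
                x_cross = poly_x[k] + (y[i] - poly_y[k]) * slope
                if x[i] < x_cross:
                    crossings += 1
            j = k
        inside[i] = (crossings % 2) == 1
    return inside


hits_x = np.array([1.0, 3.0, -1.0, 1.0])
hits_y = np.array([1.0, 1.0, 1.0, 3.0])
inside_flags = points_in_polygon(hits_x, hits_y, POLY_X, POLY_Y)
```

To use this with Numba.Declare, the vertex arrays would have to be module-level globals referenced inside a wrapper that takes only the RVec columns, since the declared signature lists column types only.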

OK, let me then reiterate a bit on this.

Since I need to go through Awkward Arrays, I would now like to filter the RVec columns.

Say I am interested in converting only those hits that have z and y in a given region.

My columns in the tree are all RVecs of the same size. Effectively, how can I create a mask and define new reduced containers from them?

Say I want to Take, or convert to Awkward Arrays, the dataframe once I have selected only

x[ z> Value && y < Value]
y[ same selection] … etc.

Is there a way to do this in RDataFrame? What would be the syntax?

Filter seems unhappy, should I use Reduce?

Thanks
Renato
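
For reference, RVec supports indexing with a boolean mask, so an expression such as x[z > 100. && y < 50.] can be passed to Define as a string. A small sketch follows, with hypothetical cut values and a hypothetical helper masked_defines that only builds the expression strings. The NumPy lines show the same per-event selection so the semantics can be checked without ROOT.

```python
import numpy as np


def masked_defines(columns, mask_expr, suffix="_sel"):
    """Map new column names to RVec mask expressions for RDataFrame.Define."""
    return {name + suffix: "{}[{}]".format(name, mask_expr) for name in columns}


# Hypothetical cut values; the column names are the ones from the question.
defines = masked_defines(["x", "y", "z"], "z > 100. && y < 50.")

# With ROOT available, the expressions would be applied like this:
#   for new_name, expr in defines.items():
#       df = df.Define(new_name, expr)

# The same selection for one event written with NumPy, to show the semantics.
x = np.array([1.0, 2.0, 3.0])
y = np.array([10.0, 60.0, 20.0])
z = np.array([150.0, 200.0, 50.0])
mask = (z > 100.0) & (y < 50.0)
x_sel = x[mask]
```

Filter keeps or drops whole events, so it needs an expression that evaluates to a single boolean per event, which is probably why it rejects an RVec mask. Something like df.Filter("ROOT::VecOps::Any(z > 100. && y < 50.)") would keep events with at least one selected hit. Reduce combines a column across events, so it is unlikely to be the right tool for a per-hit selection.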
