PROOF on Nvidia Fermi / CUDA?

As far as I am aware, the newest Nvidia Fermi / CUDA cards can work fully in double precision on something like 500 cores in parallel, and they are not that expensive. 8)

Dashing, no? :bulb:

I’ve been thinking that this might be interesting as a PROOF “virtual PROOF-Worker cluster” target (the “virtual PROOF_Master” could be the main CPU of the machine that has the Fermi card built in, or maybe even one of the available Fermi cores).

I am stupid. No?
Pepe Le Pew.

No fans of Science Fiction and / or Fantasy in here? :mrgreen:


I have been thinking a bit about how to utilise GPUs in ntuple analysis. The fundamental problem is the I/O limitation. For CUDA/OpenCL to make any sense, the input data has to be copied to the GPU memory first, so unless you do some fairly heavy computation on a small data set, the copy costs more than the GPU saves. In my work the time spent on I/O vastly outweighs the CPU time. So until someone figures out how to keep a continuous stream of data flowing through the GPU, rather than moving one set of data in and then out all the time, I can’t see it as a HEP analysis tool :-/

But, if you can prove me wrong, I’ll be glad to spend some time giving it a shot :slight_smile:
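
To put rough numbers on the copy-cost argument above, here is a minimal back-of-the-envelope sketch in C++. It is only a toy model: the names `OffloadEstimate`, `estimate_offload` and `gpu_pays_off` are made up for illustration, and the bandwidth and per-byte costs are whatever you assume them to be, not measurements of any real card. It also ignores disk I/O entirely, which for an I/O-bound job would dominate both sides.

```cpp
#include <cassert>

// Hypothetical cost model; every number fed into it is an assumption.
struct OffloadEstimate {
    double cpu_seconds;  // time to process all input on the CPU
    double gpu_seconds;  // copy to device + GPU compute + copy back
};

// bytes_in / bytes_out : data moved to / from the GPU
// bus_bytes_per_s      : effective host <-> device bandwidth
// cpu_s_per_byte       : CPU compute cost per input byte
// gpu_s_per_byte       : GPU compute cost per input byte
inline OffloadEstimate estimate_offload(double bytes_in, double bytes_out,
                                        double bus_bytes_per_s,
                                        double cpu_s_per_byte,
                                        double gpu_s_per_byte) {
    assert(bus_bytes_per_s > 0.0);
    // The transfer is paid in full before and after the kernel runs,
    // i.e. no overlap of copying and computing is assumed.
    const double transfer = (bytes_in + bytes_out) / bus_bytes_per_s;
    return {bytes_in * cpu_s_per_byte, transfer + bytes_in * gpu_s_per_byte};
}

// Offloading is only worth it if the GPU path, copies included, is faster.
inline bool gpu_pays_off(const OffloadEstimate& e) {
    return e.gpu_seconds < e.cpu_seconds;
}
```

With 1 GB of input, an assumed 5 GB/s bus and a light workload (1e-10 s per byte on the CPU), the copy alone takes longer than the whole CPU job, so `gpu_pays_off` returns false. Make the per-byte work a hundred times heavier and the balance flips, which is the "heavy computation on a small data set" case.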

It should be possible to gain some performance fairly painlessly by using Thrust instead of the STL. I have some intensive data manipulation and histogram filling where the CPU, not I/O, is the limiting factor. This would still require modifying a lot of ROOT code, though, and the benefit is questionable: GPU <-> system memory exchange may become the new bottleneck.
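
As a rough illustration of what "Thrust instead of STL" could look like for histogram filling, here is a sketch that fills a histogram using only bulk STL algorithms (sort, binary search, adjacent difference) instead of a per-entry fill loop. This is plain C++ running on the CPU, not Thrust code and not how ROOT fills histograms. Thrust offers algorithms under the same names (`thrust::sort`, `thrust::lower_bound`, `thrust::adjacent_difference`), so this formulation is the kind that should carry over, but the port itself is not shown or tested here. The function name `bulk_histogram` and the decision to drop out-of-range values are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Fill `nbins` equal-width bins over [lo, hi) using bulk algorithms only.
// Assumptions: no NaN in `values`; entries outside [lo, hi) are dropped
// (no underflow/overflow bins).
inline std::vector<std::size_t> bulk_histogram(std::vector<double> values,
                                               std::size_t nbins,
                                               double lo, double hi) {
    assert(nbins > 0 && lo < hi);
    std::vector<std::size_t> counts(nbins, 0);

    // 1. Sort so that the entries of each bin become contiguous.
    std::sort(values.begin(), values.end());

    // Skip everything below the lower edge of the first bin.
    const auto first = std::lower_bound(values.begin(), values.end(), lo);

    // 2. Cumulative counts: entries in [lo, upper edge of bin b).
    const double width = (hi - lo) / static_cast<double>(nbins);
    for (std::size_t b = 0; b < nbins; ++b) {
        const double upper =
            (b + 1 == nbins) ? hi : lo + width * static_cast<double>(b + 1);
        counts[b] = static_cast<std::size_t>(
            std::lower_bound(first, values.end(), upper) - first);
    }

    // 3. Differences of neighbouring cumulative counts are the bin contents.
    std::adjacent_difference(counts.begin(), counts.end(), counts.begin());
    return counts;
}
```

For example, `bulk_histogram({0.5, 1.5, 1.7, 2.5, -1.0, 10.0, 3.0}, 3, 0.0, 3.0)` gives bin contents 1, 2, 1. Note that the sort makes this more work than a simple fill loop on a single CPU core; it only has a chance of paying off where the bulk steps run in parallel, and the data still has to get to the GPU first.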