Question:
Is it possible to get an unmanaged pointer to a spread somehow? Currently I use Marshal to copy the array from managed to unmanaged memory, but I would like to avoid this operation because I want to handle huge arrays fast. I have made a managed C++ project that has unmanaged methods, some of them written in assembly. This kind of code is really fast, but the interoperation between managed code and it slows things down. Is it possible to write an unmanaged node plugin for vvvv? And if so, is there an example? Most of the time, when I want to do operations on huge arrays, I don't need double values but mostly byte or 16-bit float values. Then I could write assembly that uses MMX/SSE instructions, which speeds up computation massively; with 64-bit processors the gain can be around 80%.
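For what it's worth, the way I imagine avoiding the Marshal copy in a managed C++ (C++/CLI) project is to pin the managed array and hand the raw pointer straight to the unmanaged routine. A minimal sketch, assuming the array is already on the managed side; the names ProcessUnmanaged and ProcessSpread are just placeholders:

```cpp
// Unmanaged routine that works on raw memory (the real SSE/assembly code
// would live here).
#pragma unmanaged
void ProcessUnmanaged(double* data, int count)
{
    for (int i = 0; i < count; i++)
        data[i] *= 2.0;   // placeholder arithmetic
}
#pragma managed

// Pin the managed array so the GC cannot move it, then pass the raw pointer.
// No Marshal.Copy and no duplicate buffer; the pin is released when 'pinned'
// goes out of scope.
void ProcessSpread(array<double>^ values)
{
    if (values->Length == 0) return;
    pin_ptr<double> pinned = &values[0];
    ProcessUnmanaged(pinned, values->Length);
}
```

This only works while the pin is in scope, of course, so the unmanaged code must not keep the pointer around after the call returns.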
Idea:
I want to create substitute nodes for the arithmetic operations that use MMX/SSE instructions and can operate on big arrays. I would write assembly for manipulating 8-bit and 16-bit integer arrays using MMX or SSE instructions depending on need; this would improve performance (see the sketch below).
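As a rough illustration of the kind of per-element work I mean, an SSE2 version of adding two 8-bit arrays could look like the following, written with compiler intrinsics instead of hand-written assembly. The function name AddBytesSse2 is hypothetical, and the sketch assumes the count is a multiple of 16; a real node would also handle the remainder and alignment:

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Adds two byte arrays with saturation, 16 elements per instruction.
void AddBytesSse2(const unsigned char* a, const unsigned char* b,
                  unsigned char* result, int count)
{
    for (int i = 0; i < count; i += 16)
    {
        __m128i va  = _mm_loadu_si128((const __m128i*)(a + i));
        __m128i vb  = _mm_loadu_si128((const __m128i*)(b + i));
        __m128i sum = _mm_adds_epu8(va, vb);   // saturated unsigned 8-bit add
        _mm_storeu_si128((__m128i*)(result + i), sum);
    }
}
```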
Another thing I have seen here is dottore's approach to manipulating huge amounts of data (GPU particles) using textures and the GPU. I have been thinking about creating nodes that receive a spread, create a DX texture from it, and create a device inside the node which would contain the shader code. The output would be converted from a texture back into a spread once the shader code has executed. While this is good for manipulating vectors or any large arrays, there is the problem that shaders are limited to texture input, so I don't know how to implement something like particle collision with the world, or how to feed other kinds of data into the shaders.

I haven't looked into the CUDA or OpenCL documentation yet, but this might be a great solution for parallel processing of large arrays. While a 64-bit CPU can operate on eight 8-bit values simultaneously, a GPU can do much more, something like 200-300 32-bit float values per cycle. If nodes could be created that use OpenCL for arithmetic operations on value spreads in vvvv, this would create amazing possibilities and would speed things up. You could have a few GPUs in a PC and use their memory for spread storage and operations. Most operations in vvvv can be done in parallel. This might seem complicated, but I think it is simple and can be done easily. The only thing I need is direct access to vvvv's memory.
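I have not tried it yet, but I imagine the OpenCL side of such a node could be as simple as an element-wise kernel like the one below (OpenCL C, shown here as a string that the host code would compile; the kernel name mad_spread and its parameters are only illustrative, and the host-side setup with clCreateProgramWithSource / clEnqueueNDRangeKernel is omitted):

```cpp
// OpenCL C kernel source for a per-element multiply-add on a value spread.
const char* kKernelSource = R"CLC(
__kernel void mad_spread(__global const float* input,
                         __global float* output,
                         const float scale,
                         const float offset)
{
    size_t i = get_global_id(0);
    output[i] = input[i] * scale + offset;
}
)CLC";
```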
Project description:
I have created a managed implementation of PhysX nodes. So far I have created poly meshes, convex meshes, cloth, triggers and events. There were no problems and it is all working just fine. But now I have created a fluid node and emitters, and these nodes can generate huge spreads of particle data (like the GPU particles from dottore). When I look at the timings (debug mode), I see that by far the slowest operations are the arithmetic operations inside vvvv. I output a spread of vectors representing particle positions and a spread of particle ages, and then do some operations on them; let's say I generate a spread of colors in relation to particle age. This results in only a few double operations per particle over 3000 particles, and it already impacts performance.
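To make the kind of operation concrete, the age-to-color step is nothing more than a loop like the following (plain C++ here just to show the arithmetic; AgeToAlpha, maxAge and the fade curve are assumptions, not the actual node logic). Doing exactly this through spread arithmetic nodes is what shows up in the timings:

```cpp
#include <vector>

// Illustrative only: map particle age to an alpha/brightness value per particle.
void AgeToAlpha(const std::vector<double>& age, double maxAge,
                std::vector<double>& alpha)
{
    alpha.resize(age.size());
    for (size_t i = 0; i < age.size(); ++i)
    {
        double t = age[i] / maxAge;   // normalized age 0..1
        if (t > 1.0) t = 1.0;
        alpha[i] = 1.0 - t;           // fade out as the particle ages
    }
}
```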