Is there a way to vectorize that and turn it into a GPU kernel using a macro?

nestorD · on April 9, 2020

As Julia illustrates, you do not need to do the vectorization and GPU part within the differentiation.

You can write code against a GPU array library and just differentiate it (which ends up being more flexible and composable).

fluffything · on April 9, 2020

You can write a #[cuda] proc macro that:

- parses the Rust AST

- folds it into a CUDA C AST

- writes it to a temporary .cu file

- compiles it with nvcc

- links it into your Rust binary

and that allows you to launch your function as a CUDA kernel.

The Rust emu crate does this (more or less), but targets WebGPU instead of CUDA.
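
To make the steps in the list above a bit more concrete, here is a rough sketch of the "fold into CUDA C" and "compile with nvcc" parts, written as plain functions so it runs without a proc-macro crate. Everything named here (`Expr`, `emit_expr`, `emit_kernel`, `nvcc_command`) is made up for illustration. A real `#[cuda]` macro would walk the actual Rust AST (for example via the `syn` crate) rather than this toy enum, and would have to handle types, control flow, and linking, none of which is shown.

```rust
use std::path::Path;
use std::process::Command;

/// Hypothetical miniature AST for the body of an element-wise kernel.
/// It stands in for the Rust AST a real proc macro would receive.
enum Expr {
    /// Reads element `i` of the named input buffer.
    Input(&'static str),
    /// A float literal.
    Const(f32),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Fold the toy AST into a CUDA C expression string.
fn emit_expr(e: &Expr) -> String {
    match e {
        Expr::Input(name) => format!("{}[i]", name),
        // `{:?}` keeps the decimal point, so 2.0 becomes "2.0f".
        Expr::Const(c) => format!("{:?}f", c),
        Expr::Add(a, b) => format!("({} + {})", emit_expr(a), emit_expr(b)),
        Expr::Mul(a, b) => format!("({} * {})", emit_expr(a), emit_expr(b)),
    }
}

/// Wrap the expression in a one-thread-per-element CUDA kernel.
fn emit_kernel(name: &str, inputs: &[&str], body: &Expr) -> String {
    let mut params = String::new();
    for input in inputs {
        params.push_str(&format!("const float* {}, ", input));
    }
    format!(
        "extern \"C\" __global__ void {}({}float* out, int n) {{\n    \
         int i = blockIdx.x * blockDim.x + threadIdx.x;\n    \
         if (i < n) {{\n        out[i] = {};\n    }}\n}}\n",
        name,
        params,
        emit_expr(body)
    )
}

/// Build (but do not run) the nvcc invocation that would compile the
/// generated .cu file into an object file for linking.
fn nvcc_command(cu_path: &Path, obj_path: &Path) -> Command {
    let mut cmd = Command::new("nvcc");
    cmd.arg("-c").arg(cu_path).arg("-o").arg(obj_path);
    cmd
}
```

Calling `emit_kernel("axpy", &["x", "y"], &body)` with a `body` representing `2.0 * x + y` yields a kernel whose inner statement is `out[i] = ((2.0f * x[i]) + y[i]);`. The macro would then write that string to a temporary `.cu` file (e.g. with `std::fs::write`) and run the command returned by `nvcc_command`.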