I hadn't looked at Triton before, so I took a quick look at it and at how it's being used in PyTorch 2. My read is that it really lowers the barrier to new hardware ports: I think a team of around five people at a chip vendor could maintain a high-quality port of PyTorch for a non-NVIDIA platform. That's less than it used to take, which is very cool. The approach would not be to use any of the PTX machinery, but to bolt on support for, say, the vendor's supported flavor of Vulkan.
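For anyone who hasn't seen it, the key point is that Triton kernels are written in hardware-neutral Python and only get lowered to PTX (or whatever else) in the backend. A minimal vector-add kernel, adapted from Triton's own introductory tutorial, looks roughly like this:

    import torch
    import triton
    import triton.language as tl

    @triton.jit
    def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
        # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
        pid = tl.program_id(axis=0)
        offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
        # Mask guards the tail when n_elements isn't a multiple of BLOCK_SIZE.
        mask = offsets < n_elements
        x = tl.load(x_ptr + offsets, mask=mask)
        y = tl.load(y_ptr + offsets, mask=mask)
        tl.store(out_ptr + offsets, x + y, mask=mask)

    def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        out = torch.empty_like(x)
        n = out.numel()
        # One program instance per block of 1024 elements.
        grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
        add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
        return out

Nothing in the kernel body is NVIDIA-specific; the PTX lowering happens entirely in the compiler backend, which is presumably why a vendor team could swap in, say, a Vulkan/SPIR-V lowering without users having to touch their kernels.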
It also looks like they added an MLIR backend to Triton, though I wonder if Mojo has an advantage here, since it was designed with MLIR in mind? https://github.com/openai/triton/pull/1004