Sure, but the point is that Triton does not depend on the CUDA language or frontend. Triton also outputs PTX, using LLVM's NVPTX backend. The devil is in the details, but at a very high level, Triton could be ported to AMD by doing s/NVPTX/AMDGPU/. Given this, people should think twice when they say NVIDIA has a CUDA moat.
It also looks like they added an MLIR backend to Triton, though I wonder whether Mojo has an advantage here, since it was designed with MLIR in mind? https://github.com/openai/triton/pull/1004
I hadn't looked at Triton before, so I took a quick look at it and at how it's being used in PyTorch 2. My read is that it really lowers the barrier to doing new hardware ports: I think a team of around five people within a chip vendor could maintain a high-quality port of PyTorch for a non-NVIDIA platform. That's less than it used to take, which is very cool. The approach would not be to use any of the PTX stuff, but to bolt on support for, say, the vendor's supported flavor of Vulkan.