DeepSeek Open Source DeepGEMM – FP8 GEMM Library (300 lines for 1350+ FP8 TFLOPS) (twitter.com/deepseek_ai)
4 points by helloericsf 69 days ago | 1 comment



GitHub: https://github.com/deepseek-ai/DeepGEMM

- Up to 1350+ FP8 TFLOPS on Hopper GPUs
- No heavy dependency, as clean as a tutorial
- Fully Just-In-Time compiled
- Core logic at ~300 lines, yet outperforms expert-tuned kernels across most matrix sizes
- Supports dense layout and two MoE layouts



