Hacker News
SparkyMcUnicorn
6 days ago
on:
Running Qwen3 on your macbook, using MLX, to vibe ...
You can use the 0.6B model as the draft model for speculative decoding with the larger models. It speeds up 32B, but it slows down 30B-A3B dramatically.
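For anyone unfamiliar with the technique, here is a minimal sketch of greedy speculative decoding in plain Python. It is not MLX code and not how any particular library implements it: `speculative_decode`, `greedy_decode`, `toy_target`, and `toy_draft` are hypothetical names, and the "models" are toy functions that map a token list to the next token. The idea is that a small draft model proposes `k` tokens, the large target model checks them, and the longest agreeing prefix is kept.

```python
from typing import Callable, List, Tuple

Token = int
# Assumption: a "model" is any function returning the greedy next token.
NextToken = Callable[[List[Token]], Token]


def greedy_decode(model: NextToken, prompt: List[Token], max_new: int) -> List[Token]:
    """Plain greedy decoding, one model call per generated token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(model(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[Token],
                       max_new: int, k: int = 4) -> Tuple[List[Token], int]:
    """Greedy speculative decoding. Returns (tokens, verification passes)."""
    out = list(prompt)
    passes = 0
    while len(out) - len(prompt) < max_new:
        # 1. The draft model cheaply proposes k tokens.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # 2. The target checks every proposed position. A real implementation
        #    scores all k positions in one batched forward pass; this toy
        #    version calls the target per position and counts one pass.
        passes += 1
        accepted: List[Token] = []
        for i in range(k):
            expected = target(out + proposal[:i])
            accepted.append(expected)
            if expected != proposal[i]:
                break  # first mismatch: keep the target's token, drop the rest
        else:
            # Every draft token matched, so the target adds one bonus token.
            accepted.append(target(out + proposal))
        out.extend(accepted)
    return out[:len(prompt) + max_new], passes


def toy_target(seq: List[Token]) -> Token:
    """Toy stand-in for the large model: counts upward modulo 10."""
    return (seq[-1] + 1) % 10


def toy_draft(seq: List[Token]) -> Token:
    """Toy stand-in for the small model: agrees with the target except after a 4."""
    return 0 if seq[-1] % 5 == 4 else (seq[-1] + 1) % 10


if __name__ == "__main__":
    tokens, passes = speculative_decode(toy_target, toy_draft, [0], max_new=12, k=3)
    print(tokens, passes)
    print(tokens == greedy_decode(toy_target, [0], 12))
```

The output always matches what the target would have produced on its own; the only thing that changes is how many target passes it takes. The gain depends on a target pass costing much more than the draft passes it replaces. A likely reason 30B-A3B loses out (my assumption, not something measured here) is that it only activates about 3B parameters per token, so each target step is already cheap and the draft overhead plus rejected tokens outweigh the savings.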