Hacker News
villgax on Aug 15, 2023 | on: GPU-Accelerated LLM on an Orange Pi
I'm already getting 1.5 tok/s on Ubuntu running on Android via UserLand with Llama.cpp (v2-Q4). I don't really see any acceleration. If anything, I need to see my phone do something actually useful at, let's say, 7-10 tok/s.
regularfry on Aug 15, 2023
Human speech is in the 2-4 tokens per second range; I think that's about where my frustration limit is.
brucethemoose2 on Aug 15, 2023
MLC should already be pretty fast on Vulkan.