One of the most frequent questions people ask when running LLMs locally is:
"I have xx RAM and a yy GPU, can I run the zz model?"
I have vibe-coded a simple application to help you answer just that.
Update:
I've received a lot of great feedback on how to improve the app. Thank you all.
I can absolutely run models that this site says cannot be run. Shared RAM is a thing: even with limited VRAM, shared system RAM can compensate and let larger models run (slowly, admittedly, but they work).
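For anyone curious how a check like this can account for shared memory, here is a minimal back-of-the-envelope sketch. It is not the app's actual logic: the function names, the ~20% overhead factor for KV cache and runtime buffers, and the example numbers are all assumptions.

```python
def estimate_model_gb(params_billion: float, bits_per_weight: float,
                      overhead: float = 1.2) -> float:
    """Approximate memory footprint: quantized weights plus ~20% overhead (assumed)."""
    weight_gb = params_billion * bits_per_weight / 8  # GB for the weights alone
    return weight_gb * overhead


def can_run(params_billion: float, bits_per_weight: float,
            vram_gb: float, system_ram_gb: float) -> str:
    """Classify a model as VRAM-only, runnable via shared/system RAM, or too big."""
    needed = estimate_model_gb(params_billion, bits_per_weight)
    if needed <= vram_gb:
        return f"Fits in VRAM (~{needed:.1f} GB needed): should run at full speed."
    if needed <= vram_gb + system_ram_gb:
        return f"Needs shared/system RAM (~{needed:.1f} GB needed): will run, but slowly."
    return f"Does not fit (~{needed:.1f} GB needed), even counting shared RAM."


# Hypothetical example: a 70B model at 4-bit on a 24 GB GPU with 64 GB of system RAM.
print(can_run(params_billion=70, bits_per_weight=4, vram_gb=24, system_ram_gb=64))
```

The middle branch is exactly the shared-RAM case from the feedback: the model spills out of VRAM into system memory, so it still runs, just much more slowly.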