I'm very happy to hear this; maybe it's finally time to buy a ton of RAM for my PC! A local, private LLM would be great. I'd try talking to it about stuff I don't feel comfortable having on OpenAI's servers.
Getting lots of RAM will let you run large models on the CPU, but inference will be painfully slow.
Apple Silicon Macs have unified memory shared between the CPU and GPU, which lets the GPU (relatively underpowered compared to a decent Nvidia card) run these models at decent speeds with llama.cpp, far faster than on the CPU alone.
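To make that concrete, here's a minimal sketch using the llama-cpp-python bindings (the model path, context size, and prompt are placeholders, and you may need a Metal-enabled build of the package); offloading every layer to the GPU is just one constructor argument:

    # Rough sketch with llama-cpp-python (pip install llama-cpp-python).
    # The model file below is a placeholder; any GGUF-quantized model works.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/some-7b-model.Q4_K_M.gguf",
        n_gpu_layers=-1,  # offload all layers to the GPU (Metal on Apple Silicon)
        n_ctx=4096,       # context window size
    )

    out = llm("Q: Why is unified memory useful for local LLMs? A:", max_tokens=128)
    print(out["choices"][0]["text"])

The same code runs against a CPU-only or CUDA build; on Apple Silicon the Metal backend is what makes the GPU offload worthwhile.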
This should all get dramatically better/faster/cheaper within a few years, I suspect. Capitalism will figure this one out.
There's nothing Mac-specific about running LLMs locally; Macs just happen to be a convenient way to get a ton of VRAM in a single small, power-efficient package.
On Windows and Linux, yes, you'll want at least 12GB of VRAM to get much utility out of local models, but the beefiest consumer GPUs still top out at 24GB, which is pretty limiting.
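To put those numbers in perspective, here's a quick back-of-the-envelope in Python (purely illustrative; real model files add overhead for the KV cache, embeddings, context, etc.):

    # Approximate size of the weights alone: parameters * bits per weight / 8.
    def weight_gb(params_billion, bits_per_weight):
        return params_billion * bits_per_weight / 8  # billions of bytes ~= GB

    for params in (7, 13, 70):
        for bits in (4, 8, 16):
            print(f"{params}B model @ {bits}-bit: ~{weight_gb(params, bits):.1f} GB")

Even at 4-bit, a 70B model's weights are roughly 35GB, so it simply won't fit on a 24GB card, while a Mac with 64GB of unified memory can hold it.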
With Windows/Linux I think the issue is that Nvidia is artificially limiting the amount of onboard memory (they want to sell the high-memory cards for 10x more to OpenAI, etc.) and that AMD, for whatever reason, can't get their shit together.
I'm sure there are other, much more knowledgeable people here on this topic, though.