A lot of Quest games can actually turn audio into mouth movements on an avatar. I play Walkabout Mini Golf with my friends a lot and it's incredible how naturally the mouth movements match my friends' speech. I'll turn to them complaining about par on a hole and I'll see them speak in response.
The new Quest headsets are supposed to have some features for facial expression tracking for exactly this reason. But yeah, no doubt there's still a ways to go for the experience to be worthwhile.
I think that Meta actually has facial tracking in one of their VR headsets. They published a bunch of preliminary video demos a few months back. It's worth checking out.