I don't think it's a hard choice for the browser to make: send the data as it is captured by the camera. How that is displayed is a separate matter, but as timdorr demonstrates, it's not difficult to toggle.
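For what it's worth, here is a rough sketch of what a display-only toggle could look like. It is not timdorr's code; the helper names (`mirrorTransform`, `setMirrored`, `toggleMirrored`) and the `Styled` type are made up for illustration. The only real technique it relies on is the standard CSS `transform: scaleX(-1)`, which flips how an element is drawn and leaves the captured stream alone.

```typescript
// Hypothetical helpers: the names are illustrative, not from any library.
// Anything with a `style.transform` string works, including an HTMLVideoElement.
type Styled = { style: { transform: string } };

// CSS transform value for the requested orientation.
function mirrorTransform(mirrored: boolean): string {
  return mirrored ? "scaleX(-1)" : "none";
}

// Flip (or un-flip) how the element is drawn; the underlying data is untouched.
function setMirrored(el: Styled, mirrored: boolean): void {
  el.style.transform = mirrorTransform(mirrored);
}

// Switch to the opposite orientation and report the new state.
function toggleMirrored(el: Styled): boolean {
  const next = el.style.transform !== mirrorTransform(true);
  setMirrored(el, next);
  return next;
}
```

On a page you would call something like `toggleMirrored(document.querySelector("video")!)` from a button handler. Whatever gets sent to the other party stays as the camera captured it, since only the local rendering changes.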
In what way? The developer should have no problem justifying the decision: "It's what the camera sensor sees."
I absolutely understand the application in webchat and why you would want it to be mirrored at a page level, but I'm at a loss to understand why the browser implementation would try to reflect that.
I agree that it's a hard choice; I had to make this exact choice last week when working on a video chat service. Possibly because of the nature of the service (tutoring/teaching), it felt incredibly unnatural to see a non-mirrored image of myself. I tried it out both ways and it really did feel quite peculiar, and I had a hard time figuring out my own movements (kinda like trying to shave using two mirrors).
However, I do hope the implementations are consistent as to which way they flip the video.