OK, but we developed our mental model of the world through those two cameras. I agree that we still have a ways to go, but the fact is that two cameras and processing are all that is needed. That said, we can do better with more sensors.
If a company's solution to self-driving cars with just two cameras requires developing a machine learning "model of the world" (I don't think it does, but it does make it a much harder research problem), then they are going to be years behind everyone else in shipping a self-driving car.
If a company's solution is able to maintain a real-time model of the world on top of which reasoning and reaction at human-level speeds are possible, never mind the driving cars - that's priceless!
That's the understatement of the week right there. We've been working on that tiny little problem of "...and processing" for about a century (wall time), yet the result is still quite rudimentary.