If you express your grammar as a Pratt parser (http://en.wikipedia.org/wiki/Pratt_parser), which associates parsing actions with individual tokens, the amount of "token overloading" in the syntax becomes obvious.
As for why tokens are overloaded: you simply run out of good tokens if you don't reuse them. Consider the tokens '(' and '['. It's quite common to use '(' in the prefix context for grouping, and '(' in the infix context to mean a function call. How do you eliminate the overloading? You can make function calls use whitespace as an infix operator, but that creates a host of other problems. Overloading is also useful for creating parallelism in the syntax. You might use '[' in the prefix context to signify literal arrays and '[' as an infix operator to signify array dereferencing. In that situation, the overloading is synergistic.
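To make the overloading concrete, here is a rough sketch of a tiny Pratt parser in Python. Everything in it (the PREFIX and INFIX tables, the binding powers, the tuple-shaped tree nodes, the name parse_expr) is made up for illustration and not taken from any particular language or library. The point is just that '(' and '[' each show up in both tables, once as a prefix action and once as an infix action.

```python
import re

# Hypothetical toy grammar: numbers, names, + and *, plus '(' and '['.
TOKEN_RE = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[()\[\],+*])")

def tokenize(src):
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens + ["<eof>"]

class Parser:
    def __init__(self, src):
        self.tokens = tokenize(src)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expect(self, tok):
        got = self.advance()
        if got != tok:
            raise SyntaxError(f"expected {tok!r}, got {got!r}")

    def items(self, closer):
        # Comma-separated expressions up to and including the closing token.
        out = []
        if self.peek() != closer:
            out.append(self.parse())
            while self.peek() == ",":
                self.advance()
                out.append(self.parse())
        self.expect(closer)
        return out

    def parse(self, min_bp=0):
        tok = self.advance()
        if tok in PREFIX:                       # token seen in prefix position
            left = PREFIX[tok](self)
        elif tok[0].isdigit():
            left = ("num", int(tok))
        elif tok[0].isalpha() or tok[0] == "_":
            left = ("name", tok)
        else:
            raise SyntaxError(f"unexpected token {tok!r}")
        while True:
            entry = INFIX.get(self.peek())      # same token, infix position
            if entry is None or entry[0] <= min_bp:
                return left
            bp, action = entry
            self.advance()
            left = action(self, left, bp)

def group(p):                                   # prefix '(' : grouping
    inner = p.parse()
    p.expect(")")
    return ("group", inner)

def array(p):                                   # prefix '[' : array literal
    return ("array", p.items("]"))

def call(p, left, bp):                          # infix '(' : function call
    return ("call", left, p.items(")"))

def index(p, left, bp):                         # infix '[' : array dereference
    inner = p.parse()
    p.expect("]")
    return ("index", left, inner)

def binary(op):                                 # left-associative binary operator
    return lambda p, left, bp: (op, left, p.parse(bp))

PREFIX = {"(": group, "[": array}
INFIX = {"+": (10, binary("+")), "*": (20, binary("*")),
         "(": (30, call), "[": (30, index)}

def parse_expr(src):
    p = Parser(src)
    tree = p.parse()
    p.expect("<eof>")
    return tree
```

With this sketch, parse_expr("(1)") comes back as ("group", ("num", 1)) while parse_expr("f(1)") comes back as ("call", ("name", "f"), [("num", 1)]). It's the same '(' token both times; the only thing that differs is whether the parser meets it at the start of an expression or after a left operand.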