> * Word-addressed memory will make string processing interesting to implement. Maybe in space, everyone uses UTF-16.
Everyone in Minecraft, too -- almost. The string encoding in Minecraft's protocol spec is UCS-2, just a sneeze away from UTF-16. It seems Notch has a soft spot for large encodings. It makes sense from a calculation and lookup perspective, but I wonder if the increased bandwidth and storage of 16-bit code units has a measurable impact.
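For anyone wondering how big that "sneeze" is: UCS-2 is a single 16-bit unit per character, while UTF-16 adds surrogate pairs for code points above U+FFFF. Here's a rough Python sketch of the difference; the helper names `utf16_code_units` and `fits_ucs2` are made up for illustration and have nothing to do with Minecraft's actual code.

```python
def utf16_code_units(ch):
    """Return the UTF-16 code units (as ints) for a one-character string."""
    data = ch.encode("utf-16-be")  # big-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def fits_ucs2(ch):
    """True if the character needs only one 16-bit unit (i.e. it is in the BMP)."""
    return len(utf16_code_units(ch)) == 1


if __name__ == "__main__":
    for ch in ["A", "Ж", "😀"]:
        units = [hex(u) for u in utf16_code_units(ch)]
        print(repr(ch), units, "fits UCS-2" if fits_ucs2(ch) else "needs a surrogate pair")
```

So the two encodings are identical for anything in the BMP and only part ways once you leave it.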
By going UTF-16, he would make the game much more accessible to those not using a Latin alphabet.
If I were designing the game, I would favor internationalization over space efficiency in the virtual computer. That way Russian kids would get to have fun learning to write silly programs too.
UTF-8 still has all the characters UTF-16 does; it just uses more bytes to encode some of them. UTF-8 makes more sense if most of your characters are ASCII and so fit in a single byte.
You can encode every character in the Basic Multilingual Plane with a single UTF-16 code unit, while in UTF-8, BMP characters vary in length from 1 to 3 bytes.
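A quick way to see those lengths for yourself, as a small Python sketch (`encoded_sizes` is just a name I picked for the example):

```python
def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for the given string."""
    # "utf-16-be" is used so that no byte-order mark inflates the count.
    return len(text.encode("utf-8")), len(text.encode("utf-16-be"))


if __name__ == "__main__":
    # All four samples are BMP characters.
    for ch in ["A", "é", "Ж", "日"]:
        utf8, utf16 = encoded_sizes(ch)
        print(f"U+{ord(ch):04X} {ch!r}: UTF-8 = {utf8} byte(s), UTF-16 = {utf16} byte(s)")
```

Every BMP character comes out at 2 bytes in UTF-16, while the UTF-8 size depends on which character it is.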
If Notch only allows BMP characters and uses UTF-16, internationalized string manipulation is very easy, as easy as ASCII for kids to mess with.
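To make the "as easy as ASCII" point concrete, here's a hedged Python sketch that pretends a list of ints is word-addressed memory, one BMP character per 16-bit word. This is only my guess at how it might look; `to_words`, `from_words`, and `reverse_in_place` are hypothetical helpers, not anything from the actual game.

```python
def to_words(text):
    """Store one character per 16-bit word; reject anything outside the BMP."""
    words = []
    for ch in text:
        code_point = ord(ch)
        if code_point > 0xFFFF:
            raise ValueError(f"U+{code_point:X} does not fit in one 16-bit word")
        words.append(code_point)
    return words


def from_words(words):
    """Turn a list of 16-bit words back into a string."""
    return "".join(chr(w) for w in words)


def reverse_in_place(words):
    """Classic two-index swap loop; works because every character is one word."""
    i, j = 0, len(words) - 1
    while i < j:
        words[i], words[j] = words[j], words[i]
        i += 1
        j -= 1


if __name__ == "__main__":
    memory = to_words("Привет")
    reverse_in_place(memory)
    print(from_words(memory))
```

The reverse loop is the same one you would write for ASCII; with a variable-length encoding like UTF-8 it would scramble multi-byte characters.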