The Mudcat Café TM
Thread #125269   Message #2773930
Posted By: treewind
26-Nov-09 - 04:01 AM
Thread Name: Tech: Misinterpreted Characters
Subject: RE: Tech: Misinterpreted Characters
AC - I think you underestimate the importance of ASCII compatibility in UTF-8. And I know Wikipedia is not the absolute fount of all knowledge, but: "Sorting of UTF-8 strings as arrays of unsigned bytes will produce the same results as sorting them based on Unicode code points."
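
For anyone who wants to check that sorting claim, here is a rough Python sketch. The word list is made up for illustration, and it leans on the fact that Python compares strings by code point and bytes objects as unsigned bytes. The UTF-16 comparison is my own addition for contrast, not part of the Wikipedia quote.

```python
# Illustrative sample only: includes ASCII, accented Latin, CJK,
# a high BMP character (U+FF5E) and a non-BMP character (U+1D11E).
words = ["zebra", "Zoë", "école", "apple", "日本語", "～tilde", "𝄞clef"]

# Python orders str values by Unicode code point.
by_code_point = sorted(words)

# Order the same strings by their raw UTF-8 bytes (unsigned comparison).
by_utf8_bytes = sorted(words, key=lambda s: s.encode("utf-8"))

# For contrast: order by big-endian UTF-16 bytes. Surrogate pairs
# (0xD800-0xDFFF) sort below BMP characters such as U+FF5E.
by_utf16_bytes = sorted(words, key=lambda s: s.encode("utf-16-be"))

print(by_code_point == by_utf8_bytes)
print(by_code_point == by_utf16_bytes)
```

The first comparison holds for any list of valid strings, because UTF-8 was designed so that byte order follows code point order. The second fails for this particular list because of the surrogate pair.
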
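Also, UTF-8 doesn't need a byte order mark. Some software relies on a UTF-8 signature (the bytes EF BB BF at the start of the file) to distinguish it from ASCII text, but that's not the best way to do it. A lot of software designed for ASCII will work with UTF-8 unchanged, but it won't work with UTF-16, where even plain ASCII characters take two bytes, one of them zero.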
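
A small Python sketch of the ASCII compatibility point, with the caveat that the sample strings and variable names are invented for the example. It only uses the standard `codecs` module and the built-in `utf-8`, `utf-8-sig` and `utf-16-le` codecs.

```python
import codecs

# Pure ASCII text is byte-for-byte identical when encoded as UTF-8.
plain = "Thread Name: Tech"
same_bytes = plain.encode("utf-8") == plain.encode("ascii")

# A byte-oriented split on a comma, the sort of thing an ASCII-era
# program might do. Every byte of a multi-byte UTF-8 character is
# 0x80 or above, so it can never be mistaken for an ASCII comma.
line = "Anahata,café,naïve"
utf8 = line.encode("utf-8")
fields = [f.decode("utf-8") for f in utf8.split(b",")]

# Python's "utf-8" codec writes no BOM; "utf-8-sig" is the variant that does.
has_bom = utf8.startswith(codecs.BOM_UTF8)
sig_has_bom = line.encode("utf-8-sig").startswith(codecs.BOM_UTF8)

# UTF-16 stores each ASCII character as two bytes, one of them zero,
# which is what trips up code that treats a zero byte as end-of-string.
utf16 = line.encode("utf-16-le")
nul_in_utf8 = b"\x00" in utf8
nul_in_utf16 = b"\x00" in utf16

print(same_bytes, fields, has_bom, sig_has_bom, nul_in_utf8, nul_in_utf16)
```
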

As for bit twiddling, the world is full of compression algorithms that do far more of it, and nobody complains about those.

Anahata