The Mudcat Café TM
Thread #12759   Message #3057855
Posted By: Artful Codger
20-Dec-10 - 11:19 AM
Thread Name: HTML Practice Thread
Subject: RE: Input conversion
Now, John contends that only a few users (like Amos) are entering text which will display incorrectly. As a Mac user, with different native codepage mappings from Windows users, I know this to be false. I see many pages where folks are posting raw 8-bit characters instead of escaped characters, and the result is munged text for non-Windows users (or users in different locales). I don't know when the most recent auto-conversion changes were implemented, but within the last month or so I have seen postings entered in Cyrillic which were not converted to proper escapes. Presumably, if they had been received as Unicode, they would have been converted (like the fourth character in my test). And presumably they appeared correct to the poster, both in the input box before posting and in the thread after posting. Conclusion: what YOU see (as an American or English Windows user), even in preview, is not necessarily what others will see. A quick peek at the page source will bear out what I'm saying. If it ain't escaped and it ain't 7-bit ASCII, it ain't right.
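To make the point concrete, here is a minimal Python sketch (not the site's actual conversion code, which I haven't seen). It shows one raw 8-bit byte reading differently under the Windows and Mac legacy codepages, and then the escaped form that reads the same everywhere. The function name escape_non_ascii and the sample strings are my own, chosen only for illustration.

```python
def escape_non_ascii(text: str) -> str:
    """Replace every character outside 7-bit ASCII with a numeric
    character reference such as &#233; (hypothetical helper name)."""
    # "xmlcharrefreplace" is a standard Python codec error handler.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# One raw byte, as a Windows browser might send e-acute in codepage 1252.
raw = "\u00e9".encode("cp1252")        # b'\xe9'

# The same byte decoded under two different legacy codepages.
seen_on_windows = raw.decode("cp1252")     # e-acute, as the poster intended
seen_on_mac = raw.decode("mac_roman")      # a different letter entirely

# The escaped form is pure 7-bit ASCII, so no codepage can munge it.
escaped_latin = escape_non_ascii("caf\u00e9")
escaped_cyrillic = escape_non_ascii("\u041f\u0440\u0438\u0432\u0435\u0442")

if __name__ == "__main__":
    print(repr(seen_on_windows), repr(seen_on_mac))
    print(escaped_latin)
    print(escaped_cyrillic)
```

The escaped output contains nothing but plain ASCII, which is why it survives whatever codepage the reader's machine assumes.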

That said, why would text still arrive natively encoded in this age when Unicode transfer is the preferred method? I have some ideas, but I don't have time to present them now.