The Mudcat Café TM
Thread #125269   Message #2773432
Posted By: Simon G
25-Nov-09 - 11:28 AM
Thread Name: Tech: Misinterpreted Characters
Subject: RE: Tech: Misinterpreted Characters
Anahata - it's the browser's responsibility to convert the text into the correct character set, as specified by the page or the default setting, before shipping it back to the server. Any glitches resulting from copy/paste should be visible to the user in the text they have pasted in. What they see in the text area is what the server will get.

As long as the server consistently stays in one character set and deals with any escaping required, there won't be a problem. Problems on the server usually result from conversions to ASCII, which lose information.
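
To make that concrete, here is a rough sketch of the difference, assuming the server side were written in Python (my choice for illustration only, I don't know what Mudcat actually runs). The sample string is made up. Staying in UTF-8 round-trips the text unchanged, while forcing it into ASCII throws the accented letter away.

```python
# Hypothetical sample text; \u00e9 is the accented "e" in "Café".
text = "Mudcat Caf\u00e9"

# Staying in one character set (UTF-8 here): encode, then decode, nothing lost.
utf8_bytes = text.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# Converting to ASCII: the accented letter has no ASCII code,
# so with errors="replace" it is swapped for "?" and cannot be recovered.
lossy = text.encode("ascii", errors="replace").decode("ascii")

print(round_trip == text)  # True
print(lossy)               # Mudcat Caf?
```

Once the "?" is stored, there is no way to tell which character it used to be, which is why the conversion has to be avoided rather than repaired afterwards.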

John - you're getting your bytes and words mixed up. A byte is always 8 bits; a 32-bit OS has a 32-bit word (4 bytes) and a 64-bit OS has a 64-bit word (8 bytes). As for data disappearing in conversions between 16, 32 and 64 bit, this would only happen in very poorly constructed software. Not something for us to worry about. The advent of full Unicode fonts means operating systems load characters or pages of characters on demand, so there is no extra load other than disk space. Correct me if I'm wrong, but HTML never downloads a font.
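
A small sketch of the bytes versus words point, again in Python purely as an illustration. It uses the pointer size of the running Python build as a stand-in for the machine word, which is an assumption on my part (a 32-bit Python on a 64-bit OS would report 4). The number of bytes a character takes depends on the encoding, not on the word size.

```python
import struct

# Size of a native pointer in bytes: 4 on a 32-bit build, 8 on a 64-bit build.
# Used here as a proxy for the machine word size (an assumption, see above).
pointer_bytes = struct.calcsize("P")
word_bits = pointer_bytes * 8  # a byte is 8 bits

# Bytes needed for one accented character (\u00e9) in three Unicode encodings.
# These numbers are the same whatever the word size is.
char = "\u00e9"
sizes = {
    "utf-8": len(char.encode("utf-8")),
    "utf-16": len(char.encode("utf-16-le")),
    "utf-32": len(char.encode("utf-32-le")),
}

print(word_bits)
print(sizes)
```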

As for DT, the time is long past when it should have been in an ISO character set. The tools to do full internationalisation are ubiquitous; maybe you don't need to support Mandarin or Arabic, but you will probably get it all for free. The archaic link in Susan's process is the ancient copy of askSAM - perhaps time for an upgrade to a newer version.

Simon