The Mudcat Café TM
Thread #135626 Message #3094371
Posted By: GUEST,Grishka
13-Feb-11 - 11:24 AM
Thread Name: Tech: Non-ASCII character display problems
Subject: RE: Tech: Non-ASCII character display problems
Here is my first proposal, let's call it Grishka1 for reference. It solves the problem completely for all future posts, to new or old threads, with minimal programming effort. The display of old messages is unchanged.
The following changes are being proposed:
- The thread pages are unchanged, except that the whole entry area is replaced by a simple button "Answer", which technically works like a link with target="_blank", e.g.
<A HREF="answer.cfm?Thread_ID=135626" target="_blank"> <IMG SRC="/graphics/!answerButton.gif" WIDTH="50" HEIGHT="20" BORDER="0" ALT="Answer"></A>
- This means that, on pressing the button, users will see the page with the entry area in a new tab or window of their browser. To reread the thread or quote from it, they can always switch back to the original tab, and even reload it to view any contribution posted in the meantime. This I consider an advantage, not a trade-off.
- The page shall look and work exactly like the existing preview page – initially empty, of course. The important difference is that its invisible HTML header contains a line
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
which makes it work like an entry field on Google, Wikipedia, and almost everywhere else. In particular, it will accept any character directly. Escaping is no longer necessary, except for &, <, and perhaps >. If someone prefers to enter escapes, e.g. because their keyboard does not produce the desired character (e.g. &euro; for €), this will still work.
- On the Mudcat server, the script will now obtain all characters in genuine Unicode, exactly as they were entered. It will routinely transform all characters with code points above 255 into "ampersand escapes", so that future readers will be able to read them regardless of codepage, even when they appear in an old thread containing original sins.
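To make the last step concrete, here is a minimal sketch of the transformation in Python. It is only an illustration of the logic, not the Mudcat script itself (which, judging by the .cfm pages, is not written in Python). The function name escape_above_255 is made up for this example, and I am assuming that "ampersand escapes" means decimal numeric references of the form &#NNNN;.

```python
def escape_above_255(text: str) -> str:
    """Replace every character whose code point is above 255 with a decimal
    numeric character reference (assumed form: &#NNNN;).

    Characters at or below 255 are passed through untouched, as proposed.
    The name and the decimal form are assumptions made for illustration.
    """
    return "".join(
        f"&#{ord(ch)};" if ord(ch) > 255 else ch
        for ch in text
    )


if __name__ == "__main__":
    # The euro sign (U+20AC) is above 255 and gets escaped;
    # "é" (U+00E9) is below 256 and is left as entered.
    print(escape_above_255("€5 café"))
```

Note that characters from 128 to 255 are deliberately left alone here, since the proposal only speaks of characters above 255. A user who types an escape by hand is not affected either, because the ampersand itself is below 256 and passes through unchanged.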
Please discuss.
"Grishka1b": The script distinguishes between old and new threads, the latter being completely in UTF-8 to save space and bandwidth.
"Grishka2" may be a frameset (the top frame being the thread, the bottom frame containing the entry/preview area as a UTF-8 page), but I am not yet sure whether it will work and be desirable.