The Mudcat Café TM
Thread #137789   Message #3153576
Posted By: JohnInKansas
13-May-11 - 04:45 PM
Thread Name: Mudcat rewriting preformatted text
Subject: RE: Mudcat rewriting preformatted text
My previous post was an accident. I aimed at the preview box to click it, and missed and got submit instead.

It appears that the extra space is added on all lines except the first one, so the appearance that everything is aligned correctly is obtained by using <pre>[space] at the beginning so that the first line aligns with the following ones.

Unfortunately, when you copy from the mudcat page, the first space is dropped, so what you paste elsewhere has the first line "outdented" by one space - in your copy instead of in the post.

The mudcat interpreter also changes paragraph breaks to "soft breaks" (decimal char 011), and uses a normal paragraph break only at the end of a <pre> post.

In the page code, my "mistake" posted as:
<pre> What is the problem<br> With pre format<br> When posting to<br> mudcat?</pre> <P>
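For anyone who wants to poke at this, here is a rough Python sketch (not anything mudcat itself runs) that splits the page code quoted above at each <br> and shows the leading space on every line. The function names pre_lines and simulate_copy are made up for illustration, and simulate_copy is only my guess at a model of the copy behavior described earlier, where the first space gets dropped.

```python
import re

# Page code exactly as quoted in the post.
page_code = ("<pre> What is the problem<br> With pre format<br>"
             " When posting to<br> mudcat?</pre> <P>")

def pre_lines(html):
    """Return the text lines inside the first <pre>...</pre>, split at <br>."""
    match = re.search(r"<pre>(.*?)</pre>", html, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return []
    return re.split(r"<br\s*/?>", match.group(1), flags=re.IGNORECASE)

def simulate_copy(lines):
    """Hypothetical model of copying from the page: drop one leading
    space from the first line only, leave the other lines alone."""
    if not lines:
        return []
    first = lines[0][1:] if lines[0].startswith(" ") else lines[0]
    return [first] + lines[1:]

lines = pre_lines(page_code)
for line in lines:
    print(repr(line))           # every line starts with one space

for line in simulate_copy(lines):
    print(repr(line))           # first line is now "outdented" by one space
```

Printing with repr() puts quotes around each line, so the leading spaces are easy to see.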

The problem probably arises from the old html vagary about the <br>.

The ASCII/ANSI standard originally defined a linefeed, which moved down one line, but continued at the current cursor place in the line. A "carriage return" was provided as a separate character to return to the start of the line, but a "cr" remained on the same line.

In order to go to the start of the line and move down to a new line, it was necessary to have both a "cr" and a "lf".
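The "lf" and "cr" behavior described above can be sketched as a toy teletype in Python. This is only an illustration of those two rules, under the assumption that nothing else moves the cursor; the function name render is made up, and no real terminal or browser is being modeled.

```python
def render(text):
    """Render text the way an old teletype would:
    LF moves down one line and keeps the column,
    CR returns to column 0 and stays on the same line."""
    rows = [[]]
    row, col = 0, 0
    for ch in text:
        if ch == "\n":                  # line feed: down one line, same column
            row += 1
            if row == len(rows):
                rows.append([])
        elif ch == "\r":                # carriage return: column 0, same line
            col = 0
        else:
            line = rows[row]
            while len(line) <= col:     # pad with spaces up to the cursor
                line.append(" ")
            line[col] = ch              # print (or overprint) at the cursor
            col += 1
    return ["".join(r) for r in rows]

print(render("AB\nCD"))     # lf alone: second line starts where the first ended
print(render("AB\rCD"))     # cr alone: second text overprints the first
print(render("AB\r\nCD"))   # cr+lf: a normal new line
print(render("AB\n\rCD"))   # lf+cr: also a normal new line on this toy model
```

On this toy model either order of the pair gives the same result, which is why both orders could be used for "new line" in the first place.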

Unfortunately, some operating systems implemented a break (<br>) as cr+lf, and others implemented it as lf+cr. This difference is in the Operating Systems, and is not really an html thing.

In order to make the <br> code work in ALL operating systems, at least for the older html interpreters, it gets translated to what looks like either "lf cr lf" or "cr lf cr." HTML is, by early definitions, supposed to "ignore" any code it can't read, so the <br> always contains both a "lf cr" and a "cr lf," and the "extra" char (either a lf or a cr) is "ignored," but in this case it apparently appears as an extraneous "space."

This same "effect" is NOT EXCLUSIVE TO MUDCAT but appears widely on the web. The "evidence" you'll see is that at many websites, anything you copy and paste has nearly all of the "paragraph breaks" typed by the person posting replaced with "soft breaks" that appear as a "broken arrow" in Word, if you have it set to display everything. Word shows the same symbol in both cases, but in pastes from some pages it is a decimal 031 character and in pastes from other pages a decimal 011 character. This "other version" of our <pre> anomaly is found in .htm pages as well as in .cfm pages like mudcat.
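If the soft breaks in pasted text are a nuisance, a small Python sketch like the following can turn them back into ordinary line breaks. It assumes the soft break really is the decimal 011 character mentioned above; the names SOFT_BREAK and soft_to_hard_breaks are made up for illustration. Other character codes, such as the decimal 031 one, can be passed in as well if that is what turns up in a particular paste.

```python
SOFT_BREAK = chr(11)    # decimal 011, assumed to be the "soft break" character

def soft_to_hard_breaks(text, soft_chars=(SOFT_BREAK,)):
    """Replace each soft-break character in text with a plain newline."""
    for ch in soft_chars:
        text = text.replace(ch, "\n")
    return text

pasted = "What is the problem" + SOFT_BREAK + "With pre format"
print(soft_to_hard_breaks(pasted))

# Treat decimal 031 as a soft break too, if a paste contains it.
other = "When posting to" + chr(31) + "mudcat?"
print(soft_to_hard_breaks(other, soft_chars=(chr(11), chr(31))))
```

The list of characters is a parameter rather than being fixed, since the character used seems to vary from one page to another.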

John