The Mudcat Café TM
Thread #83207   Message #1528374
Posted By: JohnInKansas
25-Jul-05 - 11:26 PM
Thread Name: Hi Max: Personal edit button?
Subject: RE: Hi Max: Personal edit button?
Jon -

I don't know where the quoted info came from, but many "edit" facilities, as used in word processors for example, retain all of the information in the original, place any new text in a trailer, and place pointers within the text to say "where to go next." This can result in file sizes that are at least the sum of all the versions, and the chain of pointers can get pretty badly "twisted" - and hence slow - in playback.

In older versions of Word, at least, if you "allow fast saves" in your setup, what gets saved is the original document with an appended record of changes in the trailer (end of the file). It was not uncommon to see a "fast save" version 2 or 3 times the size of the actual current document - which you'd get if you did a normal full save so that the changes were actually inserted in place and all the pointers dropped out. This may still be the case in current Word versions, but I learned to prohibit fast saves several versions ago, so I haven't looked.
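
A rough sketch of the difference between the two kinds of save, in Python. This is not Word's actual file format; the "#EDIT" trailer records and the function names are made up purely for illustration. Each edit is (start, end, new text), applied against the text as it stands after the earlier edits.

```python
# Hypothetical illustration only: not Word's real "fast save" format.

def apply_edits(original, edits):
    """Apply each (start, end, new_text) edit in order to get the current text."""
    text = original
    for start, end, new_text in edits:
        text = text[:start] + new_text + text[end:]
    return text

def fast_save(original, edits):
    """Keep the original untouched and append a record of each change."""
    trailer = "".join(f"\n#EDIT {s} {e} {new!r}" for s, e, new in edits)
    return original + trailer

def full_save(original, edits):
    """Insert the changes in place and write only the current document."""
    return apply_edits(original, edits)

original = "The quick brown fox jumps over the lazy dog."
edits = [(4, 9, "slow"), (0, 3, "A")]

print(len(fast_save(original, edits)), "characters with fast save")
print(len(full_save(original, edits)), "characters with full save")
```

The fast-save result always starts with the complete original and only grows as edits pile up, while the full save holds nothing but the current text.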

New versions of Word do allow extensive "change tracking" and if you turn it on, all prior versions are contained in the current document, along with identification of who made each change and when it was made. This is closer to what's done in most database information systems.

Since most web servers keep their content in a database, it is most likely that an edit is stored as a new record. The pointer to the data item that is a "post" can be switched from the original record to point to the new one that replaces it, but the old record is seldom actually removed. Deletion merely means that there's no longer a pointer to tell you to look at it - but it remains on the server, at least until some sort of "purge" is done. Even those who don't deal much with database files may see this in an email program, where you need to periodically "compact files" to get rid of the dead records.
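
A minimal sketch of that pointer-switching idea, assuming a toy in-memory store. The PostStore class and its method names are hypothetical; this is not how the Mudcat server, or any particular database, is actually built.

```python
# Hypothetical toy store: records are only ever appended; pointers say which are live.

class PostStore:
    def __init__(self):
        self.records = []    # every record ever written, in order
        self.pointers = {}   # post id -> index of the live record

    def add(self, post_id, body):
        self.records.append(body)
        self.pointers[post_id] = len(self.records) - 1

    def edit(self, post_id, body):
        # An edit is a new record; the old one stays put with no pointer to it.
        self.add(post_id, body)

    def delete(self, post_id):
        # Deleting only drops the pointer; the record itself remains.
        del self.pointers[post_id]

    def read(self, post_id):
        return self.records[self.pointers[post_id]]

    def dead_records(self):
        live = set(self.pointers.values())
        return [i for i in range(len(self.records)) if i not in live]

    def compact(self):
        # The "purge": rebuild the store from the live records only.
        new_records, new_pointers = [], {}
        for post_id, index in self.pointers.items():
            new_records.append(self.records[index])
            new_pointers[post_id] = len(new_records) - 1
        self.records, self.pointers = new_records, new_pointers

store = PostStore()
store.add(1, "first draft")
store.add(2, "another post")
store.edit(1, "second draft")
store.delete(2)
print(store.read(1), "| dead records:", store.dead_records())
store.compact()
print("records after compacting:", store.records)
```

Until compact() runs, both the superseded draft and the deleted post are still sitting in the record list; they are just unreachable.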

The actual amount of excess disk space required for edits is probably insignificant relative to the traffic in new posts, but allowing large numbers of gratuitous edits can significantly increase the number of "dead records." As in the case of the recent Mudcat data server troubles, if you need to reconstruct things and/or correct link errors, the presence of even a few such unlinked records can horribly complicate recovery, since you have to determine whether there's no pointer to a record because it was deleted (or replaced by an edit) or because the pointer wasn't recovered in the reconstruction.
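
To make the recovery problem concrete, here is a small sketch with made-up data. The record numbers, post names, and the unlinked() helper are all hypothetical; the point is only that an intentionally dead record and a record whose pointer was lost look identical afterward.

```python
# Hypothetical data: record id -> contents, and post name -> live record id.

def unlinked(records, pointers):
    """Return the ids of records that no pointer refers to."""
    live = set(pointers.values())
    return sorted(set(records) - live)

records = {
    0: "post A, original text",   # superseded by an edit
    1: "post A, edited text",
    2: "post B",
}

pointers_before_crash = {"A": 1, "B": 2}
pointers_recovered = {"A": 1}     # assume the pointer to post B was not recovered

print(unlinked(records, pointers_before_crash))
print(unlinked(records, pointers_recovered))
```

After the rebuild, records 0 and 2 are both unlinked, and nothing in the data itself says that 0 was a discarded edit while 2 is a real post that should be relinked. The more dead records edits leave behind, the more of these cases someone has to sort out by hand.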

I'm not sure I'd agree specifically with the statement that editing makes the files a lot larger, but it certainly can make them a lot more complex and a lot more trouble to maintain.

Maybe John Hardly can provide more information, but I don't have any problem putting "no edits" on the "good side" and "edits" on the "not so good side" for what look to me like sound technical reasons. The fact that you can write a simple program to add a feature doesn't carry much weight in the argument about whether you should add it.

John