The Mudcat Café TM
Thread #93687   Message #1806008
Posted By: JohnInKansas
10-Aug-06 - 05:11 AM
Thread Name: Tech: Wanna Make PDF files?
Subject: RE: Tech: Wanna Make PDF files?
Richard -

Your original request was -

recover the Word document behind it in order to edit it?

The point here is that recovering an original document and making a new document from a PDF are two different things.

You cannot, in general, recover or even re-create the original document, even if you know that it was a Word or WordPerfect document to begin with. A whole bunch of stuff in the document gets "interpreted," and the original document input is thrown away when the PDF is created.

As an example, you don't know, and cannot determine from the PDF, whether an indent in the original came from a paragraph format, a tab, a series of spaces, a paragraph style, a margin change, or the result of placing part of the text in a text box or a frame. All the PDF knows is that the line starts at a specific place in page coordinates.
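
A toy sketch in Python may make the indent example concrete. It is not how any real PDF maker works; the function name, the margin, the tab stop, and the space width are all made up for illustration. It only shows three different "source" indents collapsing to the same starting position on the page.

```python
# Toy model only: every name and number here is a made-up assumption.
LEFT_MARGIN = 72.0   # points (72 points = 1 inch)
TAB_STOP = 36.0      # hypothetical half-inch tab stop
SPACE_WIDTH = 4.5    # hypothetical width of one space, in points

def render(paragraph):
    """Flatten one 'source' paragraph to what a page keeps: (x, text)."""
    x = LEFT_MARGIN + paragraph.get("style_indent", 0.0)
    text = paragraph["text"]
    # Leading tabs and spaces only move the starting position.
    while text and text[0] in "\t ":
        x += TAB_STOP if text[0] == "\t" else SPACE_WIDTH
        text = text[1:]
    return (x, text)

sources = {
    "tab": {"text": "\tIndented line"},
    "eight spaces": {"text": " " * 8 + "Indented line"},
    "paragraph style": {"text": "Indented line", "style_indent": 36.0},
}

rendered = {name: render(p) for name, p in sources.items()}
for name, placed in rendered.items():
    print(f"{name:16} -> {placed}")

# All three sources end up as the same positioned text.
print(len(set(rendered.values())) == 1)
```

Given only the (x, text) pair, there is nothing left to tell you which of the three sources produced it.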

The original document may have had "variables" that were filled in from information in a whole bunch of external files. The PDF cannot even tell you that a particular item is a linked item or a field result. It can only incorporate the current value of the item at the time of the PDF creation.
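
The same kind of toy sketch works for field results. The "{name}" field syntax, the flatten function, and the sample values below are hypothetical stand-ins, not any real word processor's format. The sketch just shows that once the current value is filled in, a field result and hand-typed text come out identical.

```python
# Toy model only: the field syntax and sample values are made up.
import re

def flatten(source_text, values):
    """Replace each {name} field with its current value, as a print-style export would."""
    return re.sub(r"\{(\w+)\}", lambda m: values[m.group(1)], source_text)

values = {"client": "Acme Ltd", "date": "10 Aug 2006"}  # hypothetical external data

with_fields = "Prepared for {client} on {date}."
typed_by_hand = "Prepared for Acme Ltd on 10 Aug 2006."

print(flatten(with_fields, values))
# The flattened output no longer shows which version used fields.
print(flatten(with_fields, values) == flatten(typed_by_hand, values))
```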

It is relatively easy to create a new document that will print something that looks like what the PDF prints. You can also create a new spreadsheet in Excel that prints something that looks like what the PDF prints, or a new HTML doc that displays on the web just like the PDF looks. You cannot "back engineer" your way to re-creating the original document, because the information to do so is not contained in the PDF file.

With newer versions of the Adobe PDF maker, the person who does the conversion can allow persons reading the file, using only the Adobe Reader, to insert comments, and in some cases to "edit" to make corrections or changes. With older versions, only persons who used one of the more advanced Adobe programs could do these things. In most cases a PDF is used specifically to prevent readers from changing it, so the edit feature is seldom turned on except for members of a production group.

There are a lot of programs that can produce a document in another program that will "look like" the PDF. Some of them do a fair job, and some are rather disappointing. This is a much different thing than "extracting the original file," which is what you asked about.

John