The Mudcat Café TM
Thread #93687   Message #1807548
Posted By: JohnInKansas
11-Aug-06 - 04:57 PM
Thread Name: Tech: Wanna Make PDF files?
Subject: RE: Tech: Wanna Make PDF files?
Snuffy -

I don't know whether it's true of all scanners, but the ones I've seen make a PDF only as a "graphic image," so a text document scanned to PDF is just pictures. You can't search for text in one of these, so far as I know. This is also what you get from image editing programs like Photoshop or Photoshop Elements that allow you to "Save As PDF." Even if you type text in one character at a time, the PDF is just a "picture" and the individual characters are not identifiable in the PDF (except via OCR).
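
For anyone who wants a rough check on which kind of PDF they have, here is a minimal Python sketch. It is only a heuristic of my own, not anything Adobe or the scanner makers provide: it looks through the raw bytes of the file for a font entry or an embedded image entry. The function name is made up for illustration. It can guess wrong, for example on newer PDFs that compress their internal dictionaries, or on scans that have had an OCR text layer added.

```python
import re

def classify_pdf_bytes(data: bytes) -> str:
    """Crude guess at the kind of PDF: 'text', 'image-only', or 'unknown'.

    Assumption: the PDF's dictionaries are stored uncompressed, so the
    markers below are visible in the raw bytes. That is not always true.
    """
    # A page that draws real characters needs a font resource.
    has_font = re.search(rb"/Font\b", data) is not None
    # A scanned page is normally stored as an image XObject.
    has_image = re.search(rb"/Subtype\s*/Image\b", data) is not None
    if has_font:
        return "text"
    if has_image:
        return "image-only"
    return "unknown"
```

To try it on a file, read the file in binary mode, e.g. `classify_pdf_bytes(open("some.pdf", "rb").read())`, where `some.pdf` is whatever file you want to check.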

The method is commonly used by people who sell technical publications and university theses. They scan each page to get an image and then just make each image a "page" in the PDF. The PDF files they sell are too expensive for me, so I've seen very few of them to experiment with. The method also often results in extremely large files. A "text based PDF" of a dozen pages would rarely be more than 300 KB, or perhaps 500 KB if there's a lot of "formatting," while a dozen pages of one of these "image PDFs" might run to a couple of MB or more.
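
A bit of arithmetic shows why the image files get so big. The sketch below works out the raw, uncompressed size of one scanned page. The page size (8.5 x 11 inches), the 300 dpi resolution, and the bit depths are my assumptions for illustration, not figures from any particular scanner.

```python
def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of one scanned page (no PDF overhead)."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8

# Assumed: a letter-size page scanned at 300 dpi.
black_and_white = raw_scan_bytes(8.5, 11, 300, 1)  # 1,051,875 bytes
grayscale = raw_scan_bytes(8.5, 11, 300, 8)        # 8,415,000 bytes
print(black_and_white, grayscale)
```

So even a black-and-white scan starts at about 1 MB per page before compression, while the characters on a page of text amount to only a few thousand bytes. The image data in a PDF is normally compressed, which is how a dozen scanned pages can come down to a couple of MB, but the text version still starts from far less.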

An example is at Geneva Bible where you can download a separate PDF of each book; but they're "image PDFs" that are not searchable in any "look for text" manner.

When the document is converted directly from a word processor to PDF, the characters in the text remain text, and you usually can copy the text from the PDF and paste it into another word processor. You'll then have a text-based document that you can edit and search in.

The freeware programs listed in the first post here, and many others, use a "print to PDF" method that should preserve the text. You install a "PDF printer." Any program that can print can be used to print to that "printer," and the result is a PDF file "printed to" your drive instead of a bunch of paper that comes out of a physical printer. Since the program sends character names to the printer, the "dot" that gets printed remains an identifiable character in the file that's produced. When an image is sent, the "dot" is a whole image that is embedded in the PDF.
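
To show what "remains an identifiable character" means inside the file, here is a minimal Python sketch that writes a tiny one-page PDF by hand. This is not how any of the "PDF printer" programs work internally; it is just my own illustration, and the function name is made up. It assumes plain Latin-1 text and uses the built-in Helvetica font so that nothing has to be embedded.

```python
def make_text_pdf(text: str) -> bytes:
    """Build a tiny one-page PDF whose page content holds `text` as characters."""
    # Escape the characters that are special inside a PDF literal string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    # "Tj" is the PDF operator that shows a string of characters.
    content = f"BT /F1 24 Tf 72 720 Td ({escaped}) Tj ET".encode("latin-1")
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte position of each object, for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_at = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_at
    return bytes(out)
```

If you write the result to disk, e.g. `open("hello.pdf", "wb").write(make_text_pdf("Hello"))`, and open the file in a plain text editor, you can read `(Hello) Tj` right in it. Real PDF makers usually compress this part of the file, so you won't normally see it that plainly, but the characters are still stored as characters and not as a picture.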

If you get the "real Adobe" program, even in the simplest PDF making versions, you should get an entry in your Office programs, on toolbars and in menus, that allows you to create PDFs directly from the program just by clicking. With one of the free programs, you usually click on Print, and then select the "PDF Printer" that does the conversion to a PDF file. (You can also set up a "PDF Printer" with the Adobe programs, for programs that can print but that can't add the menu items.)

Conversion from PDF to something else, without a conversion program, depends on whether the person who created the PDF blocked copying. If you can select text (with the text tool) in the PDF, you usually can copy text from the PDF and paste it into a word processor. You may need to separately use the "graphic select" tool to copy pictures, which can be separately pasted into another program. If you use the graphics selection tool to copy and paste into another program, all you'll paste usually(?) is "pictures of the text," even if the original PDF contained "character based" text that was copied.

Photoshop Elements, and probably many other graphic editing programs, can import a page from a PDF, but only one page at a time, and all you get is a "picture" of the page, which is not easily editable.

John