The Mudcat Cafe



User Name: cnd
Thread Name: Tech: OCR Tips - Optical Character Recognition (22)
Subject: RE: Tech: OCR Tips - Optical Character Recognition
Posted: 18 Jan 22


I've had mixed success using www.onlineocr.net -- it's free and technically has a cap on how many images it can process per hour, but I've only hit that limit once, when trying a new method where I broke each page into dozens of smaller pictures before running OCR on them.

In general, it does a passable job, but it can be really hit-or-miss. Anything with patterns in the margins or pictures in the background is a no-go, and as I hinted at in the beginning, longer pages tend to lower its accuracy. If the surface isn't glossy, it works much better on pictures taken with flash. I like to set up an overhead light, a lamp, and natural light if possible, so that I have two or three (or more) light sources to illuminate the page evenly and minimize shadows and glare.
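Since longer pages seem to hurt accuracy, here is a rough sketch of how the "smaller pictures" idea could be planned out. It only works out where to cut a tall photo into horizontal strips. The function name `strip_boxes`, the strip height, and the overlap are all my own placeholders, not anything the OCR site asks for. The overlap is there so a line of text sitting on a cut still shows up whole in one of the strips.

```python
def strip_boxes(width, height, strip_height, overlap=0):
    """Return (left, upper, right, lower) boxes that cover a page top to bottom.

    All values are in pixels. Consecutive boxes share `overlap` rows so that
    a text line falling on a cut is complete in at least one strip.
    """
    if strip_height <= 0:
        raise ValueError("strip_height must be positive")
    if not 0 <= overlap < strip_height:
        raise ValueError("overlap must be >= 0 and smaller than strip_height")

    boxes = []
    top = 0
    while top < height:
        bottom = min(top + strip_height, height)
        boxes.append((0, top, width, bottom))
        if bottom == height:
            break  # reached the end of the page
        top = bottom - overlap  # step back a little for the next strip
    return boxes


if __name__ == "__main__":
    # Hypothetical 1000 x 2500 pixel photo, cut into 1000-pixel strips.
    for box in strip_boxes(1000, 2500, 1000, overlap=100):
        print(box)
```

The boxes are in (left, upper, right, lower) order, which is the order Pillow's `Image.crop` expects, so each one could be passed to it and the result saved as its own image. Keep the hourly cap in mind: more strips means more uploads.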

When I say it can be hit or miss, I mean that it does a fine job 80% of the time, but sometimes it will get a text totally wrong -- as in, not a single legible word. This can happen even when every other image from the set renders fine. It isn't a very common issue, though, and I think it tends to be more likely when the image I'm trying to OCR isn't ideal.

As a side note, I've discovered recently that Google Drive auto-OCRs images and non-searchable text documents (i.e. scanned PDFs). This is great for finding things, but you have to set your files up the right way. A search will only tell you which document the word came up in, not which page -- as a result, I broke every document I wanted searchable into single-page TIFFs or PDFs, so that I only have to read the one page rather than a large document.
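For anyone who wants to script the single-page split, this is a minimal sketch and not necessarily how I did it. It assumes the third-party pypdf package is installed, and the names `page_path`, `split_pdf`, and the `pages` output folder are made up for illustration. The page number goes into each file name, zero-padded, so a search hit tells you the page straight away and the files sort in order.

```python
from pathlib import Path


def page_path(source, page_number, total_pages, out_dir="pages"):
    """Build a name like pages/songbook_p007.pdf for one page of a document."""
    src = Path(source)
    digits = len(str(total_pages))  # pad so the files sort in page order
    return Path(out_dir) / f"{src.stem}_p{page_number:0{digits}d}{src.suffix}"


def split_pdf(source, out_dir="pages"):
    """Write each page of `source` to its own single-page PDF."""
    # Imported here so page_path can be used without pypdf installed.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(source)
    total = len(reader.pages)
    Path(out_dir).mkdir(parents=True, exist_ok=True)

    written = []
    for number, page in enumerate(reader.pages, start=1):
        writer = PdfWriter()
        writer.add_page(page)
        target = page_path(source, number, total, out_dir)
        with open(target, "wb") as handle:
            writer.write(handle)
        written.append(target)
    return written
```

Calling `split_pdf("songbook.pdf")` on a hypothetical 120-page scan would leave files named `songbook_p001.pdf` through `songbook_p120.pdf` in the `pages` folder, ready to upload.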

