Convert PDF images to text

Jskid

I tried searching a PDF for a word and it didn't find it. I then realized it's not actually text, just an image of text. How can I convert the PDF so that it contains real text and I can search it?
 
I have been using Omnipage for quite a while.
It is a commercial application (i.e. not freeware),
but it has several advantages over most freeware that I have tried.
One of them is the ability to handle multiple files at once, and multiple pages per file.
I find its OCR to be highly accurate,
and though I do not really care for Nuance (the current owner of the product),
I do recommend the program.
 
I should specify: there are some tables, formulas, and images that I want preserved as images. For example, can Omnipage automatically take a section of the PDF and preserve it as an image if no characters are recognized? The PDF I have is several hundred pages long, and going through each page manually wherever Omnipage can't interpret the writing (because it isn't writing) would be a nightmare.
 
For the images, what I would do is run OCR on the entire document with whatever program you choose. It will very likely screw up the tables beyond hope. So just delete those portions from the OCR'd output, use the Snipping Tool in Vista/7/8 to cut the images out of the original document, and paste them into the new one. If you are still on XP, press Alt+Prt Scr to capture the active window, then open your favorite image editor (I like IrfanView), paste, and crop.
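If you would rather not cut and paste by hand, another route is to keep every page as the original image and only add an invisible text layer on top, so tables and formulas stay untouched but the PDF becomes searchable. Below is a rough Python sketch of that idea. It is not Omnipage: it assumes the free Tesseract engine and Poppler are installed, along with the pytesseract, pdf2image, and pypdf packages. The names page_batches and make_searchable are made up for this example, and the dpi and batch_size defaults are guesses, so treat it as a starting point rather than a finished tool.

```python
import io
from typing import Iterator, Tuple


def page_batches(total_pages: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield 1-based inclusive (first, last) page ranges covering the document."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for first in range(1, total_pages + 1, batch_size):
        yield first, min(first + batch_size - 1, total_pages)


def make_searchable(src_pdf: str, dst_pdf: str, dpi: int = 300, batch_size: int = 10) -> None:
    """Write a copy of src_pdf whose pages carry an invisible OCR text layer."""
    # Third-party imports are local so page_batches works without them installed.
    # Assumption: Tesseract and Poppler are installed and on the PATH.
    import pytesseract
    from pdf2image import convert_from_path
    from pypdf import PdfReader, PdfWriter

    total_pages = len(PdfReader(src_pdf).pages)
    writer = PdfWriter()
    for first, last in page_batches(total_pages, batch_size):
        # Rasterize only a few pages at a time to keep memory use bounded
        # on a document that is several hundred pages long.
        images = convert_from_path(src_pdf, dpi=dpi, first_page=first, last_page=last)
        for image in images:
            # Returns a one-page PDF: the page image plus a hidden text layer.
            page_pdf = pytesseract.image_to_pdf_or_hocr(image, extension="pdf")
            writer.add_page(PdfReader(io.BytesIO(page_pdf)).pages[0])
    with open(dst_pdf, "wb") as out:
        writer.write(out)
```

You would call it as make_searchable("scan.pdf", "scan_searchable.pdf"), where both file names are placeholders. The pages are re-rendered as images, so the output file may be larger than the original.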
 
As suggested by SNGX1275,
Omnipage may not render tables accurately... though it will try.
I think it does a fair job with most layouts.
Rendering plain text is pretty darned good.
If tables need to be individually tweaked, I have been able to do that manually to my satisfaction.
On the other hand, if you want tables preserved as graphics,
that can be done too, just by manually changing the zone type.
Unfortunately, I do not think there is a trial version of Omnipage.
I like it. It has served my needs very well. But YMMV.
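On the question of automatically keeping a region as an image when no characters are recognized: I do not know of a way to script that inside Omnipage, but the idea itself can be sketched with the free Tesseract engine through the pytesseract package. Tesseract reports a block number, text, and confidence for every item it finds on a page, so a block where no word comes back is a candidate to leave as a graphic. The function names and the min_conf threshold below are my own for illustration, and real tables or formulas will often produce some stray "words", so expect to tune it.

```python
from typing import Dict, List, Set


def blocks_without_text(ocr: Dict[str, List], min_conf: float = 0.0) -> Set[int]:
    """Return block numbers in which no word was recognized with conf >= min_conf.

    `ocr` is expected to have the shape produced by
    pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT).
    """
    all_blocks: Set[int] = set()
    text_blocks: Set[int] = set()
    for block, text, conf in zip(ocr["block_num"], ocr["text"], ocr["conf"]):
        if block == 0:
            continue  # block 0 is the page-level row, not a real block
        all_blocks.add(block)
        # Structural rows have empty text and a confidence of -1.
        if str(text).strip() and float(conf) >= min_conf:
            text_blocks.add(block)
    return all_blocks - text_blocks


def find_image_blocks(page_image, min_conf: float = 0.0) -> Set[int]:
    """Run Tesseract on one page image and list the blocks to keep as graphics."""
    # Assumption: Tesseract and the pytesseract package are installed.
    import pytesseract

    ocr = pytesseract.image_to_data(page_image, output_type=pytesseract.Output.DICT)
    return blocks_without_text(ocr, min_conf)
```

The same dictionary also carries left, top, width, and height for each row, so the flagged blocks could be cropped out of the page image instead of being retyped or clipped by hand.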
 