Convert PDF images to text

By Jskid · 6 replies
Jan 10, 2014
  1. I tried searching a PDF for a word and it didn't find it. I then realized it's not actually text, just an image of text. How can I convert the PDF so that it contains real text and can be searched?
  2. jobeard

    jobeard TS Ambassador Posts: 11,173   +989

    Try Foxit Reader
    Jskid likes this.
  3. B00kWyrm

    B00kWyrm TechSpot Paladin Posts: 1,436   +37

    I have been using Omnipage for quite a while.
    It is a commercial application (i.e., not freeware),
    but has several advantages over most freeware that I have tried.
    One of them is the ability to handle multiple files at once, and multiple pages per file.
    I find it to be highly accurate in its OCR,
    and though I do not really care for Nuance (the current owner of the product),
    I do recommend the program.
    Jskid likes this.
  4. Jskid

    Jskid TS Guru Topic Starter Posts: 346

    I should specify, there are some tables, formulas, and images that I want preserved as images. For example, can Omnipage automatically take a section of the PDF and preserve it as an image if no characters are recognized? The PDF I have is several hundred pages long, and going through each page manually whenever Omnipage can't interpret the writing (because it isn't writing) would be a nightmare.
  5. Jskid

    Jskid TS Guru Topic Starter Posts: 346

    I can't find the OCR button anywhere. I tried googling for instructions but the current version has a new interface I can't figure out how to navigate.
  6. SNGX1275

    SNGX1275 TS Forces Special Posts: 10,742   +422

    For the images... What I would do is run OCR on the entire document with whatever program you end up using. It will very likely screw up the tables beyond hope. So just delete those portions from your OCR'd output and use the Snipping Tool in Vista/7/8 to cut the images out of the original document, then paste them into your new one. If you are still on XP, use Alt+Prt Scr, then paste the screenshot into your favorite image editor (I like IrfanView) and crop it.
  7. B00kWyrm

    B00kWyrm TechSpot Paladin Posts: 1,436   +37

    As suggested by SNGX1275,
    Omnipage may not render tables accurately... though it will try.
    I think it does a fair job with most layouts.
    Its rendering of plain text is pretty darned good.
    If tables need to be individually tweaked, I have been able to do that manually to my satisfaction.
    On the other hand, if you want tables preserved as graphics,
    that can be done too, just by manually changing the zone type.
    Unfortunately, I do not think there is a trial version of Omnipage.
    I like it. It has served my needs very well. BUT, ymmv.
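
For what it's worth, the "keep it as an image if no characters are recognized" idea from post 4 can be sketched in a few lines of Python. This is only an illustration of the decision rule, not how Omnipage works internally: the `Zone` record, the confidence score, and both thresholds are assumptions made up for the example, and you would have to fill the zones from whatever OCR engine you actually use.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """One region of a page as reported by an OCR engine (hypothetical record)."""
    page: int
    text: str          # characters the engine recognized, possibly empty
    confidence: float  # assumed score in [0, 1]; not an Omnipage value


def classify_zone(zone, min_confidence=0.6, min_chars=3):
    """Return "text" if the zone looks like real text, otherwise "graphic".

    Both thresholds are arbitrary starting points chosen for this sketch.
    """
    # Count only letters and digits so stray marks from table rules
    # or formula symbols do not pass as recognized text.
    recognized = "".join(ch for ch in zone.text if ch.isalnum())
    if len(recognized) < min_chars:
        return "graphic"
    if zone.confidence < min_confidence:
        return "graphic"
    return "text"


def plan_document(zones):
    """Pair each zone's page number with the zone type it should get."""
    return [(zone.page, classify_zone(zone)) for zone in zones]
```

Zones that come back as "graphic" would be the ones to keep as pictures (or to set to the graphic zone type by hand), so only the doubtful regions need a manual look instead of every page.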
