Convert scanned PDF to searchable text?

Last Updated: Sep 14th, 2017 12:31 pm
Deal Addict
Sep 23, 2013
3967 posts
437 upvotes
NORTH YORK
bogolisk wrote:
Sep 12th, 2017 3:37 pm
Nope, those are scanned PDFs. They are images... I tried your steps with a scanned PDF (from my Canon MFP), no go! The PDF was neat, clean and straight. Evince can't select anything. Adobe-Reader selects and pastes as image. Job not done.
I stand corrected. I tried a couple of scanned PDFs, and the "select" function does not work.
You are right.
Daniel

Fido $15 3Gb plan
free TextmeUp SMS & incoming call
Deal Addict
Jan 18, 2009
1784 posts
656 upvotes
danieltoronto wrote:
Sep 12th, 2017 5:36 pm
BTW there is no need for you to test a scanned page. Those images could be an issue. Say you save this particular web page as a PDF file by issuing "print". Then you select, copy and paste. Try it.
The problem might have been caused by the scan not being perfectly straight and horizontal.
Oh, that's why it "works" for you. Those PDFs from printing are text-based and fully searchable. No need to do anything with them.
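A rough way to tell the two kinds of PDF apart without opening a viewer, as a minimal sketch only: text-based PDFs normally declare at least one font, while image-only scans usually declare none. The function name `looks_text_based` is hypothetical, and the check is a crude heuristic. It can miss fonts in files that store their objects in compressed object streams, and a scan that already carries an OCR text layer will also report fonts.

```python
def looks_text_based(pdf_bytes: bytes) -> bool:
    """Crude guess: a PDF that declares a font probably contains real text.

    Assumption: the font entry is visible in the raw, uncompressed part
    of the file. This is a heuristic, not a PDF parser.
    """
    return b"/Font" in pdf_bytes


def check_file(path: str) -> bool:
    # Read the whole file as bytes and apply the heuristic above.
    with open(path, "rb") as handle:
        return looks_text_based(handle.read())
```

If `check_file("scan.pdf")` returns `False` (the file name is just a placeholder), the file is most likely a pure image scan and needs OCR before it becomes searchable.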
"A Eruchîn, ú-dano i faelas a hyn an uben tanatha le faelas!" -- Aragorn
Hath no loyalty to any ISP, for you shall receive none!
Deal Addict
Jan 18, 2009
1784 posts
656 upvotes
mystery wrote:
Sep 12th, 2017 7:30 pm
You got it.
Gotta try it. Paid $13.00 for the full Office suite but never use any of it.
"A Eruchîn, ú-dano i faelas a hyn an uben tanatha le faelas!" -- Aragorn
Hath no loyalty to any ISP, for you shall receive none!
Deal Addict
Jan 18, 2009
1784 posts
656 upvotes
mystery wrote:
Sep 12th, 2017 7:30 pm
You got it.
You're right. Word 2016 can handle scanned PDFs. However, not as well as Google Drive. From the same scan:

Word 2016:
These programs make the task Of the conversion easy. You can make important updates in the document With ease. A user can also reformat and change the entire document into editable format. A person can convert it back to the Portable Document FormaL The state-of-art computer programs enable the user to convert PDF to Word document in batches or select the desirable documents for conversion.
Users can easily convert the desirable documents into the editable file forntat to make use Of precious information trapped in the scanned files. You can extract the data and use it for preparing reports, projects and Other such documents.
Drive:
These programs make the task of the conversion easy. You can make important updates in the document with ease. A user can also reformat and change the entire document into editable format. A person can convert it back to the Portable Document Format. The state-of-art computer programs enable the user to convert PDF to Word document in batches or select the desirable documents for conversion.
Users can easily convert the desirable documents into the editable file format to make use of precious information trapped in the scanned files. You can extract the data and use it for preparing reports, projects and other such documents.
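To compare two OCR results like these without reading them side by side, a word-level diff helps. The following is a small sketch using Python's standard `difflib`; the function name `ocr_differences` is made up for illustration, and treating the Drive output as the reference is an assumption, since the original scanned text is not shown here.

```python
import difflib


def ocr_differences(candidate: str, reference: str):
    """Return (candidate_words, reference_words) pairs where the outputs disagree."""
    a, b = candidate.split(), reference.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only replaced, inserted, or deleted words
    ]


# One sentence boundary taken from the two outputs above.
word_2016 = "convert it back to the Portable Document FormaL The state-of-art computer programs"
drive = "convert it back to the Portable Document Format. The state-of-art computer programs"

for got, expected in ocr_differences(word_2016, drive):
    print(f"{got!r} -> {expected!r}")  # prints: 'FormaL' -> 'Format.'
```

Counting the returned pairs gives a quick, rough error tally for each tool on the same scan.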
"A Eruchîn, ú-dano i faelas a hyn an uben tanatha le faelas!" -- Aragorn
Hath no loyalty to any ISP, for you shall receive none!
Newbie
Sep 12, 2008
24 posts
6 upvotes
Canada
Just a warning (and opinion, I can't prove it yet):

Anything you put into Google Docs, esp. if it goes through the OCR (convert-to-text) process, is probably out there in the ether on Google's servers and all over the planet. Google's reputation for privacy is that it doesn't exist.

I prefer local-based software which keeps everything on my hard drive. Just saying. Your mileage may vary, of course.
Deal Addict
Feb 29, 2012
2092 posts
1058 upvotes
Richmond
thewhopper wrote:
Sep 13th, 2017 10:35 pm
would this work?

https://www.onlineocr.net/
Probably. But it's free and they are promoting it pretty hard. So what's their angle? I wouldn't run any confidential documents through it.

At least if it's Google, only a mega-corporation, all their employees, probably some of their contractors, and the U.S. government spy agencies will be reading your confidential documents.
Deal Addict
Jan 18, 2009
1784 posts
656 upvotes
shplad wrote:
Sep 13th, 2017 10:22 pm
Just a warning (and opinion, I can't prove it yet):

Anything you put into Google Docs, esp. if it goes through the OCR (convert-to-text) process, is probably out there in the ether on Google's servers and all over the planet. Google's reputation for privacy is that it doesn't exist.

I prefer local-based software which keeps everything on my hard drive. Just saying. Your mileage may vary, of course.
For your personal collection of pedophilic literature, sure, don't use Google. Myself, I have nothing to hide.
"A Eruchîn, ú-dano i faelas a hyn an uben tanatha le faelas!" -- Aragorn
Hath no loyalty to any ISP, for you shall receive none!
Deal Addict
Jan 6, 2011
2979 posts
228 upvotes
GTA
Is there one that uses all CPU cores?

Acrobat and ABBYY FineReader only use one core. Huge difference when files are big.
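One possible workaround, sketched under stated assumptions rather than offered as a tested recipe: split the scan into one image per page and run a single-threaded OCR engine on several pages at once. The example below assumes the open-source `tesseract` command-line tool is installed (it is not one of the programs named in this thread) and that the pages have already been exported as image files. The function names are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from os import cpu_count


def ocr_page_with_tesseract(image_path: str) -> str:
    """OCR one page image. Assumes the `tesseract` CLI is on the PATH;
    passing `stdout` as the output base makes it print the recognized text."""
    result = subprocess.run(
        ["tesseract", image_path, "stdout"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def ocr_pages(image_paths, ocr_page=ocr_page_with_tesseract, workers=None):
    """OCR pages concurrently and return the text in page order.

    Threads are enough here because each page is handled by its own
    external process, so the operating system can spread them over cores.
    """
    with ThreadPoolExecutor(max_workers=workers or cpu_count()) as pool:
        return list(pool.map(ocr_page, image_paths))
```

Calling `ocr_pages(["page-001.png", "page-002.png"])` (placeholder file names) would then return one text string per page. Whether this actually beats a single-core commercial tool depends on the scan and the engine, so treat it as something to try, not a guarantee.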
