LinuxQuestions.org
Linux - Desktop This forum is for the discussion of all Linux Software used in a desktop context.

Old 12-04-2023, 04:04 PM   #1
jude7
Member
 
Registered: Apr 2020
Posts: 66

Rep: Reputation: Disabled
How do I copy/paste from a searchable pdf?


I am running Mint 18.2, Cinnamon 3.4.3, and all I want to do is copy/paste from a searchable pdf. I have looked in LibreOffice writer, and xreader, and GIMP, but I don't see any way to do this....

Seems like this should be easy, but I'm not seeing it.....
 
Old 12-04-2023, 04:53 PM   #2
rkelsen
Senior Member
 
Registered: Sep 2004
Distribution: slackware
Posts: 4,469
Blog Entries: 7

Rep: Reputation: 2572
Try using LibreOffice Draw.
 
Old 12-04-2023, 05:17 PM   #3
goldennuggets
Member
 
Registered: Feb 2003
Location: USA
Distribution: Kubuntu, Manjaro
Posts: 239

Rep: Reputation: 24
Did you try Atril?
What about opening the PDF in your web browser and copying the text from there?
 
Old 12-04-2023, 05:21 PM   #4
jude7
Member
 
Registered: Apr 2020
Posts: 66

Original Poster
Rep: Reputation: Disabled
I'll try it.....
 
Old 12-04-2023, 05:41 PM   #5
jude7
Member
 
Registered: Apr 2020
Posts: 66

Original Poster
Rep: Reputation: Disabled
OK, it doesn't really work: when I highlight the text in the PDF, only about 50% actually gets highlighted, and that is all that will "paste", so I end up mainly with letters with spaces between them, not words.

I should add that the searchable PDF was made with an online utility that converts a PDF to a supposedly searchable form, and it seemed to work fine.

Maybe I need a better conversion program, because it looks as if the converter didn't recognize all the letters, but there is nothing wrong with the original text - it's clear and sharp.
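
One rough way to check whether the text layer itself is at fault (rather than the viewer) is to measure how much of the extracted text consists of one-character "words". The following is a minimal sketch, assuming a POSIX shell and awk; the helper name `single_char_ratio` is made up for this example, and what counts as a "high" percentage is a judgment call.

```shell
#!/bin/sh
# single_char_ratio (hypothetical helper): read text on stdin and print the
# percentage of whitespace-separated tokens that are exactly one character long.
single_char_ratio() {
    awk '
        { for (i = 1; i <= NF; i++) { total++; if (length($i) == 1) single++ } }
        END {
            if (total == 0) print 0
            else printf "%d\n", (single * 100) / total
        }'
}

# Text broken into isolated letters scores high, ordinary prose scores low.
printf 'm a i n l y l e t t e r s\n' | single_char_ratio
printf 'mainly whole words here\n' | single_char_ratio
```

With poppler-utils installed, `pdftotext file.pdf - | single_char_ratio` would apply the same check to a real document (`file.pdf` is a placeholder name).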
 
Old 12-04-2023, 05:43 PM   #6
goldennuggets
Member
 
Registered: Feb 2003
Location: USA
Distribution: Kubuntu, Manjaro
Posts: 239

Rep: Reputation: 24
Quote:
Originally Posted by jude7
OK, it doesn't really work: when I highlight the text in the PDF, only about 50% actually gets highlighted, and that is all that will "paste", so I end up mainly with letters with spaces between them, not words.

I should add that the searchable PDF was made with an online utility that converts a PDF to a supposedly searchable form, and it seemed to work fine.

Maybe I need a better conversion program, because it looks as if the converter didn't recognize all the letters, but there is nothing wrong with the original text - it's clear and sharp.
It does sound as if the pdf document you're attempting to copy from is corrupted in some way or wasn't formatted properly. You're on the right track to go back to the source.
 
Old 12-05-2023, 02:06 AM   #7
pan64
LQ Addict
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 22,035

Rep: Reputation: 7344
I don't know what kind of PDF it is, but sometimes they contain only images of the pages. In that case you cannot mark/select text, but you can use OCR to convert it to text.
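
A quick way to tell the two kinds of PDF apart is to see whether any text can be extracted at all. This is a sketch only, assuming `pdftotext` from poppler-utils is installed; the helper name `has_text_layer` and its exit codes are invented for this example. Note that a PDF with a poor OCR layer will still pass this check, since it only detects the image-only case.

```shell
#!/bin/sh
# has_text_layer (hypothetical helper): exit 0 if pdftotext can extract any
# non-whitespace characters from the PDF, 1 if it extracts none, and
# 2 if pdftotext is missing or the file does not exist.
has_text_layer() {
    command -v pdftotext >/dev/null 2>&1 || { echo "pdftotext not found" >&2; return 2; }
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 2; }
    # "-" as the output name sends the text to stdout; tr drops all whitespace
    # (including the form feeds pdftotext puts between pages) before counting.
    chars=$(( $(pdftotext "$1" - 2>/dev/null | tr -d '[:space:]' | wc -c) ))
    [ "$chars" -gt 0 ]
}
```

For example, `has_text_layer scan.pdf && echo "has selectable text" || echo "no text found, probably needs OCR"` (`scan.pdf` is a placeholder name).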
 
Old 12-05-2023, 06:44 AM   #8
goldennuggets
Member
 
Registered: Feb 2003
Location: USA
Distribution: Kubuntu, Manjaro
Posts: 239

Rep: Reputation: 24
You might also be able to upload it to an AI service, such as ChatGPT or Perplexity, and ask it to provide you with the text.
 
Old 12-05-2023, 08:29 AM   #9
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware, Slarm64 & Android
Posts: 16,448

Rep: Reputation: 2342
Have you tried pdftotext?
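
For anyone who has not used it, a minimal sketch of calling pdftotext follows. It assumes poppler-utils is installed; the wrapper name `pdf_text` and the file names are placeholders made up for this example.

```shell
#!/bin/sh
# pdf_text (hypothetical helper): print the text of a PDF to stdout.
# Usage: pdf_text FILE [FIRST_PAGE LAST_PAGE]
# Returns 2 if pdftotext is missing or FILE does not exist.
pdf_text() {
    command -v pdftotext >/dev/null 2>&1 || { echo "pdftotext not found" >&2; return 2; }
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 2; }
    if [ -n "${2:-}" ] && [ -n "${3:-}" ]; then
        # -f and -l select the first and last page to convert.
        pdftotext -f "$2" -l "$3" "$1" -
    else
        # "-" as the output name writes the text to stdout.
        pdftotext "$1" -
    fi
}
```

`pdf_text document.pdf > document.txt` saves the result to a file; running plain `pdftotext document.pdf` writes `document.txt` next to the PDF. The `-layout` option keeps the physical layout of the page instead of the reading order, which may or may not help with multi-column documents.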
 
Old 12-06-2023, 08:54 PM   #10
teckk
LQ Guru
 
Registered: Oct 2004
Distribution: Arch
Posts: 5,152
Blog Entries: 6

Rep: Reputation: 1835
pdftotext works fairly well, but if you have a two-column PDF, or one with images, it's worthless. PDF is just not a format made for that. A PDF should be an end product, not a source.

Depends on how the pdf is made.


Example:

cat file1.txt
Code:
Ok, It doesn't really work: When I highlight the text in the pdf, 
only about 50% actually gets highlighted, and that is all that will 
"paste", so I end up with mainly letters with spaces between them, 
not words.

I should add that the searchable pdf was made from an online utility 
to convert the pdf to a supposedly searchable form, and it seemed to 
work fine.

maybe I need a better conversion program, because it looks as if the 
converter didn't recognize all the letters, but there is nothing wrong 
with the original text - it's clear and sharp.
Code:
libreoffice --convert-to "pdf" file1.txt

toc.txt
Code:
[/Page 1 /View [/XYZ null null null] /Title (My Bookmark) /OUT pdfmark
Code:
gs -o file2.pdf -sDEVICE=pdfwrite toc.txt -f file1.pdf
Open file1.pdf and file2.pdf in a PDF viewer. All the text is copyable.
 
1 members found this post helpful.
Old 12-12-2023, 03:25 PM   #11
jefro
Moderator
 
Registered: Mar 2008
Posts: 22,019

Rep: Reputation: 3630
Can we assume the author of the work has not enabled cut?
 
1 members found this post helpful.
Old 12-12-2023, 03:58 PM   #12
jude7
Member
 
Registered: Apr 2020
Posts: 66

Original Poster
Rep: Reputation: Disabled
I'm not overly sophisticated on that sort of thing - wish I was - but since it's a screen-cap of a doc saved by me as a PDF, I would imagine the answer must be no. It's likely a PDF that's not going to offer any help.
 
Old 12-12-2023, 04:07 PM   #13
goldennuggets
Member
 
Registered: Feb 2003
Location: USA
Distribution: Kubuntu, Manjaro
Posts: 239

Rep: Reputation: 24
Quote:
Originally Posted by jude7
I'm not overly sophisticated on that sort of thing - wish I was - but since it's a screen-cap of a doc saved by me as a PDF, I would imagine the answer must be no. It's likely a PDF that's not going to offer any help.
You're going to need to use an OCR service or AI to read the text first and convert it to actual text. Then you'll be golden.
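
If a local tool is preferred over an online service, OCRmyPDF is one option. The sketch below assumes `ocrmypdf` and `pdftotext` are both installed and that the document is in English; the helper name `ocr_to_text` and the file names are made up for this example, and OCR quality depends heavily on the scan.

```shell
#!/bin/sh
# ocr_to_text (hypothetical helper): add an OCR text layer to IN, write the
# result to OUT, then print the recognized text to stdout.
# Usage: ocr_to_text IN.pdf OUT.pdf
# Returns 2 if a required tool is missing or IN does not exist.
ocr_to_text() {
    for tool in ocrmypdf pdftotext; do
        command -v "$tool" >/dev/null 2>&1 || { echo "$tool not found" >&2; return 2; }
    done
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 2; }
    # -l eng selects the English Tesseract language data (an assumption here).
    ocrmypdf -l eng "$1" "$2" && pdftotext "$2" -
}
```

For example, `ocr_to_text screencap.pdf searchable.pdf > text.txt`. By default ocrmypdf stops with an error on pages that already contain text, so it is best run on the original image-only PDF; `--force-ocr` is the option for redoing a file that already has a poor text layer.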
 
Old 12-12-2023, 04:11 PM   #14
jude7
Member
 
Registered: Apr 2020
Posts: 66

Original Poster
Rep: Reputation: Disabled
OK, thanks.

Not the news I wanted, but at least I won't waste more time looking for something that's not available.
 
Old 12-12-2023, 04:12 PM   #15
goldennuggets
Member
 
Registered: Feb 2003
Location: USA
Distribution: Kubuntu, Manjaro
Posts: 239

Rep: Reputation: 24
Quote:
Originally Posted by jude7
OK, thanks.

Not the news I wanted, but at least I won't waste more time looking for something that's not available.
You're welcome. There are a lot of free options out there, like this one: https://www.iloveocr.com
 
  



All times are GMT -5.
