From Newsgroup: comp.text.pdf
On 3 Apr 2024 14:29:40 GMT, Stefan Ram wrote:
ram@zedat.fu-berlin.de (Stefan Ram) wrote or quoted:
Text in PDFs is sometimes compressed. So one can either use programs
like "Agent Ransack" to search for text in PDFs or tools like
"pdftotext" to first create a text file for every PDF file and then grep
those text files.
PS: "Agent Ransack" is Windows software. "pdftotext" is also available
for Linux. Converting all PDFs to text files needs to be done only
once, and then search operations on those text files are faster than
scanning the PDF files for text on every search!
I should maybe have elaborated a bit. Sometimes I
remember a certain phrase or word but forget which
pdf it is in. With text files I can do
grep blabla *.txt
and I wanted an equivalent. Using pdftotext would
mean using it for every suspect pdf. Since a lot of
pdf files are searchable, I figured that such a
command might exist.
But if there really is a pdfgrep command, that might
do the job. I will do some googling, thanks.
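
For what it's worth, the convert-once-then-search idea quoted above could be sketched in Python roughly as follows. This is only a sketch under assumptions: it assumes pdftotext is installed and on the PATH, and the function names convert_new_pdfs and search_texts are made up for illustration. It converts only PDFs that have no up-to-date .txt next to them, so repeated searches do not re-scan the PDFs.

```python
import subprocess
from pathlib import Path


def convert_new_pdfs(folder, converter=None):
    """Create a .txt file next to each PDF that lacks an up-to-date one."""
    if converter is None:
        # Assumption: the pdftotext command is installed and on the PATH.
        def converter(pdf, txt):
            subprocess.run(["pdftotext", str(pdf), str(txt)], check=True)
    converted = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        txt = pdf.with_suffix(".txt")
        # Skip PDFs whose text file already exists and is not older.
        if txt.exists() and txt.stat().st_mtime >= pdf.stat().st_mtime:
            continue
        converter(pdf, txt)
        converted.append(txt)
    return converted


def search_texts(folder, phrase):
    """Return the names of the .txt files that contain the phrase."""
    hits = []
    for txt in sorted(Path(folder).glob("*.txt")):
        if phrase in txt.read_text(errors="replace"):
            hits.append(txt.name)
    return hits


if __name__ == "__main__":
    convert_new_pdfs(".")
    for name in search_texts(".", "blabla"):
        print(name)
```

The search is a plain substring match, roughly like "grep -l -F blabla *.txt". It will find nothing in scanned PDFs that contain only images, since those have no text to extract.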