
PHP run linux “less” command via exec – binary file warning

I have to convert some PDF files to TXT. I ended up using the “less” command because pdftotext, for example, has problems with tables in PDFs. The problem is that when I run the command through the exec function (or shell_exec/system), less only reports that the selected PDF is a binary file, and the resulting file is just a TXT file with raw PDF data in it. When I run the same command directly in a terminal, everything works. I also tried logging in as the www_data user and running the command as that user, and there was no problem there either.

Command:

$ less /var/www/original.pdf > /var/www/new.txt

PHP code:

exec("less -f /var/www/original.pdf > /var/www/new.txt 2>&1");

Result from PHP exec:

"/var/www/original.pdf" may be a binary file.  See it anyway?

The “-f” option in the exec command is there so that you don’t have to press “y” to confirm “yes, I want to see it anyway.”

set | grep less yields:

LESSCLOSE='/usr/bin/lesspipe %s %s'
LESSOPEN='| /usr/bin/lesspipe %s'
            Lossless LZW RLE Zip' -- "$cur" ));
                _apport_parameterless
                _apport_parameterless
                _apport_parameterless
                _apport_parameterless
_apport_parameterless () 


Answer

From what I have read, your console can display a PDF file with less because you have an input preprocessor installed, such as lesspipe or lessfile. less uses such a preprocessor when the environment variable LESSOPEN is set; its value points to the lesspipe or lessfile script.

The environment in which PHP runs your command probably lacks that variable. There might be a way to replicate the console behavior on your web server, through environment variables and shell commands, so that your calls to less parse PDFs properly.
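One way to check this is to print whether LESSOPEN is visible to the process that runs less. The snippet below is only a sketch: lessopen_status is a hypothetical helper name, and the subshell with LESSOPEN unset merely imitates an environment where the variable is missing. Running the function once from a terminal and once through exec from PHP should show whether the two environments differ.

```shell
#!/bin/sh

# lessopen_status (hypothetical helper): report whether less would find
# an input preprocessor configured through LESSOPEN.
lessopen_status() {
    if [ -n "${LESSOPEN:-}" ]; then
        printf 'set: %s\n' "$LESSOPEN"
    else
        printf 'not set\n'
    fi
}

# Status in the current environment.
lessopen_status

# Status in a subshell where LESSOPEN has been removed, to imitate an
# environment that never ran the preprocessor setup.
( unset LESSOPEN; lessopen_status )
```

If the variable turns out to be missing under PHP, that would be consistent with the binary file warning described in the question.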

What I would suggest is calling a bash script to do the conversion instead of calling less directly. That way, the script can set the appropriate environment variables and run the appropriate commands to convert your PDF files to readable output.

Here’s an example of how to do this:

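#!/bin/bash

# Export LESSOPEN/LESSCLOSE so that less runs the input preprocessor.
eval "$(lesspipe)"
# $1 is the input PDF, $2 is the output text file.
less "$1" > "$2" 2>&1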

Then, from PHP, call that script like this:

exec("/path/to/your/script/script.sh /var/www/original.pdf /var/www/new.txt");

If that doesn’t work, try replacing lesspipe with lessfile in the eval line.
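The lesspipe/lessfile fallback can also be folded into the script itself. The following is a sketch under assumptions, not a tested drop-in: pick_preprocessor is a hypothetical helper name, and the script assumes that both tools, when run without arguments, print shell code that sets LESSOPEN and LESSCLOSE (the behavior the eval line above relies on).

```shell
#!/bin/sh
# Hypothetical wrapper, usage: script.sh INPUT.pdf OUTPUT.txt

# Print the name of the first preprocessor helper found on PATH.
# Returns non-zero if neither lesspipe nor lessfile is available.
pick_preprocessor() {
    for candidate in lesspipe lessfile; do
        if command -v "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Only convert when called with exactly two arguments.
if [ "$#" -eq 2 ]; then
    if helper=$(pick_preprocessor); then
        # Assumed: with no arguments the helper prints code that sets
        # LESSOPEN and LESSCLOSE, which eval then applies to this shell.
        eval "$("$helper")"
    else
        echo "no less input preprocessor found" >&2
        exit 1
    fi
    less "$1" > "$2" 2>&1
fi
```

Exiting with an error when no helper is found means the script does not silently write raw PDF data to the output file.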

User contributions licensed under: CC BY-SA