Unix cat command problems [closed]

I started an online command-line course last night. I was playing around with some basic commands and, for some reason, every time I run cat mydoc.txt or cat mydoc.docx, it only outputs question marks and other random symbols to the terminal. I searched this site and Google and can’t find an exact solution to this specific problem. I came across a couple of sites that suggested changing the file permissions, but that didn’t seem to affect the output.

Any insight is appreciated!

Answer

A .doc file is a binary file: its bytes can take any value in the range 0x00-0xFF, arranged in a layout that MS Word knows how to handle. It has many internal subsections, tables, etc. A .docx file is binary too: it is a ZIP archive of XML parts.
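
To look at such bytes without letting the terminal interpret them, a hex dump is safer than cat. This is only a minimal sketch, assuming a POSIX shell with printf, od and tr available; sample.bin is a made-up file holding four arbitrary bytes, standing in for a real .doc:

```shell
# sample.bin is a hypothetical stand-in for a real .doc file
sample="${TMPDIR:-/tmp}/sample.bin"

# Write four arbitrary non-text bytes (octal escapes): 0x00 0xd0 0xcf 0xff
printf '\000\320\317\377' > "$sample"

# Dump the bytes as hex instead of sending them raw to the terminal;
# tr strips the spacing that od puts between columns
hex=$(od -An -tx1 "$sample" | tr -d ' \n')
echo "$hex"
```

On a real document you would probably pipe the dump through head or a pager, since the output gets long quickly.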

When you cat it to a terminal, it’s just a stream of binary bytes. The terminal program tries to interpret this as text. It will usually assume the UTF-8 encoding of Unicode, in which each character (called a “code point”) is stored as a variable-length sequence of 1-4 bytes.

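Not every sequence of 1-4 bytes is a valid UTF-8 encoding of a code point. When the terminal program finds an invalid sequence, it outputs a ? (typically the Unicode replacement character, often drawn as a question mark inside a diamond).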
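
A rough way to see both points from the shell is sketched here. It assumes the standard iconv utility is installed and that it exits with a non-zero status on input it cannot decode; is_valid_utf8 is a name made up for this example:

```shell
# is_valid_utf8 is a hypothetical helper: it succeeds only if stdin decodes
# as UTF-8 (converting to UTF-16 forces iconv to actually decode the input)
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-16 > /dev/null 2>&1
}

# The code point U+00E9 (e with an acute accent) is stored as two bytes, 0xC3 0xA9
e_acute_len=$(printf '\303\251' | wc -c | tr -d ' ')
echo "bytes used: $e_acute_len"

if printf '\303\251' | is_valid_utf8; then
    echo "c3 a9: valid UTF-8"
fi

# 0xC3 announces a two-byte sequence, but 0x28 is not a continuation byte
if ! printf '\303\050' | is_valid_utf8; then
    echo "c3 28: not valid UTF-8"
fi
```

A terminal facing the second sequence has nothing sensible to draw for the 0xC3 byte, which is where the question marks come from.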

Otherwise, the terminal program will try to output what it thinks is the correct character. This might be a German vowel with an umlaut over it, or a character from the Chinese character set.

That’s what you get if the terminal program has a font that contains that particular character. If no such font is available, the terminal program will [again] output a ?.

Note that all of this is just the terminal program’s “best effort” to interpret as text what is, in reality, an arbitrary binary sequence. It’s similar to running cat /usr/bin/cat: that is a binary file that really has no text to speak of.

If the file you cat is just a simple text file [ASCII or UTF-8 encoded], what you did will work. To see this, try a simple text file, e.g. cat /etc/passwd. Or do echo abc > /tmp/foo and then cat /tmp/foo.
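
If you want to check a file before sending it to the terminal, one crude heuristic is to look for NUL bytes, which ASCII or UTF-8 text normally does not contain. The sketch below assumes a shell with head -c available; looks_binary and the two demo file names are invented for this example, and the test is not foolproof (UTF-16 text, for instance, contains NUL bytes):

```shell
# looks_binary is a hypothetical helper: it succeeds if the first 1024 bytes
# of the file contain at least one NUL byte
looks_binary() {
    total=$(head -c 1024 "$1" | wc -c | tr -d ' ')
    without_nul=$(head -c 1024 "$1" | tr -d '\000' | wc -c | tr -d ' ')
    [ "$total" -ne "$without_nul" ]
}

# Two small demo files: one plain text, one with an embedded NUL byte
textfile="${TMPDIR:-/tmp}/foo_text_demo"
binfile="${TMPDIR:-/tmp}/foo_binary_demo"
echo abc > "$textfile"
printf 'ab\000c' > "$binfile"

for f in "$textfile" "$binfile"; do
    if looks_binary "$f"; then
        echo "$f: looks binary, not sending it to the terminal"
    else
        cat "$f"
    fi
done
```

Only the text file is passed to cat; the other one is reported and skipped.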

Of course, if your goal was just to open the .doc under Linux/*BSD, etc., there are programs that understand these files. Of note: LibreOffice is a full open source suite of programs similar to MS Office, and the one you would want is LibreOffice Writer. If you’ve got a standard distro installed (e.g. Ubuntu or Fedora), it will probably already be installed.

User contributions licensed under: CC BY-SA