I recently switched to another working machine and ran into a problem with Cyrillic text. My Bash script fetches new messages from an application and processes them.
However, the messages are mostly written in Cyrillic, and I get results like “\u043f\u0440\u0438\u0432\u0456\u0442”.
On my old system, which ran Ubuntu, I could easily convert this to normal letters with echo -e:
[18:18 deimos@nc ~] echo -e "\u043f\u0440\u0438\u0432\u0456\u0442"
привіт
Unfortunately, that does not work on the new system, which runs CentOS 6:
[15:21] [server1.nichan.net ~] # echo -e "\u043f\u0440\u0438\u0432\u0456\u0442"
\u043f\u0440\u0438\u0432\u0456\u0442
Both systems are set to English. The CentOS one was installed just today, so there is not much on it. The only things I have installed so far are pip and some Python modules my scripts require, so it’s safe to say the system is fresh.
Also, other Unicode symbols seem to work just fine. The only issue is Cyrillic:
[15:21] [server1.nichan.net ~] # echo -e "\xE2\x98\xA0"
☠
Any help would be greatly appreciated.
UPD: It turned out my Bash was outdated. I had 4.1, and this feature requires at least 4.2. I updated Bash to 4.3 using this guide:
wget https://ftp.gnu.org/pub/gnu/bash/bash-4.3.tar.gz
tar xvfz bash-4.3.tar.gz
cd bash-4.3/
./configure
make
ls -la bash
cp -f bash /bin/bash
/bin/bash
Answer
Support for \uXXXX Unicode literals in the arguments to echo and printf was added in bash 4.2.
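Before relying on \u escapes, a script can check which bash it is running under. The following is a minimal sketch, not a definitive test: the function name supports_unicode_escapes is made up for illustration, and it only compares version numbers against 4.2. Whether the output is rendered as readable Cyrillic also depends on the terminal and on a UTF-8 locale being active, which is assumed here.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds when the given (or the running) bash version
# is at least 4.2, the release whose change log lists \uXXXX support.
# Usage: supports_unicode_escapes [major minor]
supports_unicode_escapes() {
    local major=${1:-${BASH_VERSINFO[0]}}
    local minor=${2:-${BASH_VERSINFO[1]}}
    (( major > 4 || (major == 4 && minor >= 2) ))
}

if supports_unicode_escapes; then
    # Assumes a UTF-8 locale; printf interprets \u in its format string.
    printf '\u043f\u0440\u0438\u0432\u0456\u0442\n'
else
    echo "bash $BASH_VERSION does not understand \\u escapes" >&2
fi
```

On a system like the CentOS 6 machine from the question (bash 4.1), this would take the else branch instead of printing the escapes unchanged.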
From the change log:
This document details the changes between this version, bash-4.2-alpha, and the previous version, bash-4.1-release.
[…]
New Features in Bash
[…]
d. $'...', echo, and printf understand \uXXXX and \UXXXXXXXX escape sequences.
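If upgrading bash is not an option, the \uXXXX escapes can also be decoded by hand. The sketch below is one possible workaround, not something taken from the change log: the function names decode_u_escapes, print_utf8 and print_byte are hypothetical, it handles only four-digit \uXXXX escapes (no \UXXXXXXXX and no surrogate pairs), and it builds the UTF-8 bytes with \xHH escapes, which already work on the older bash as the ☠ example in the question shows.

```shell
#!/usr/bin/env bash

# Emit a single byte given its numeric value (0-255).
print_byte() {
    printf '%b' "$(printf '\\x%02x' "$1")"
}

# Encode one code point (U+0000 to U+FFFF) as UTF-8 and print it.
print_utf8() {
    local cp=$1
    if (( cp < 0x80 )); then
        print_byte "$cp"
    elif (( cp < 0x800 )); then
        print_byte $(( 0xC0 | (cp >> 6) ))
        print_byte $(( 0x80 | (cp & 0x3F) ))
    else
        print_byte $(( 0xE0 | (cp >> 12) ))
        print_byte $(( 0x80 | ((cp >> 6) & 0x3F) ))
        print_byte $(( 0x80 | (cp & 0x3F) ))
    fi
}

# Hypothetical helper: replace every \uXXXX in the argument with its
# UTF-8 encoding, copy everything else through, then print a newline.
decode_u_escapes() {
    local s=$1 re='^\\u([0-9a-fA-F]{4})'
    while [[ -n $s ]]; do
        if [[ $s =~ $re ]]; then
            print_utf8 $(( 16#${BASH_REMATCH[1]} ))
            s=${s:6}    # skip the backslash, the "u" and four hex digits
        else
            printf '%s' "${s:0:1}"
            s=${s:1}
        fi
    done
    printf '\n'
}

decode_u_escapes '\u043f\u0440\u0438\u0432\u0456\u0442'   # prints: привіт
```

The loop is quadratic in the length of the input, which is acceptable for short messages; for large inputs, an external tool would likely be a better fit.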