I have several large files (3-6 GB each) of ASCII '1' and '0' characters, and I would like to convert them to plain binary files. Newlines are not important and should be discarded.
test.bin below is 568 bytes (560 characters plus 8 newlines); I would like the 560-bit (70-byte) file.
0111000110000000101000100000100100011111010010101000001001010000111000 1001100011010100001101110000100010000010000000000001011000010011111100 0100001000010000010000010111011101011111000111111000111001100010100011 0011101000100001111111000001111110111111101101100000011000010101100001 0000000110110001000000000001000011110100000101101000001000010001010011 1101101111010101011110001110000010011001100101101101000111111101110101 1000001100101101010111110111110101100000000011001000100000000011001110 0101101001110010011110000100101001001111010011100100001001111111100110 ...
I've found several solutions for converting a binary file into ASCII, but none for the reverse.
Ideally I'm looking for a simple Linux / bash solution, but I could live with a Python solution.
=================== Edit ==================
To make this less confusing consider converting any two ASCII characters into a binary file.
test_XY_encoded.txt
XYYYXXXYYXXXXXXXYXYXXXYXXXXXYXXYXXXYYYYYXYXXYXYXYXXXXXYXXYXYXXXXYYYXXX YXXYYXXXYYXYXYXXXXYYXYYYXXXXYXXXYXXXXXYXXXXXXXXXXXXYXYYXXXXYXXYYYYYYXX XYXXXXYXXXXYXXXXXYXXXXXYXYYYXYYYXYXYYYYYXXXYYYYYYXXXYYYXXYYXXXYXYXXXYY XXYYYXYXXXYXXXXYYYYYYYXXXXXYYYYYYXYYYYYYYXYYXYYXXXXXXYYXXXXYXYXYYXXXXY XXXXXXXYYXYYXXXYXXXXXXXXXXXYXXXXYYYYXYXXXXXYXYYXYXXXXXYXXXXYXXXYXYXXYY YYXYYXYYYYXYXYXYXYYYYXXXYYYXXXXXYXXYYXXYYXXYXYYXYYXYXXXYYYYYYYXYYYXYXY YXXXXXYYXXYXYYXYXYXYYYYYXYYYYYXYXYYXXXXXXXXXYYXXYXXXYXXXXXXXXXYYXXYYYX XYXYYXYXXYYYXXYXXYYYYXXXXYXXYXYXXYXXYYYYXYXXYYYXXYXXXXYXXYYYYYYYYXXYYX
Where X represents the binary 0 and Y represents the binary 1.
Answer
How about this bash command?
cat test.bin | tr -d '\n' | perl -lpe '$_=pack"B*",$_' > true_binary.txt
'tr' deletes all newline characters, and the perl command uses pack with the "B*" template to turn each group of eight '0'/'1' characters into one byte, most significant bit first.
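Since the question also allows a Python solution, here is a minimal sketch of the same idea that reads the input in chunks instead of holding a multi-gigabyte line in memory. The function name pack_bits, its parameters, and the chunk size are made up for illustration. It assumes that every byte other than the two chosen symbols (newlines, spaces, anything else) should be dropped, and that a final group of fewer than eight bits should be padded with zero bits on the right. The zero and one parameters cover the X/Y variant from the edit.

```python
import io


def pack_bits(src, dst, zero=b"0", one=b"1", chunk_size=65536):
    """Pack ASCII bit characters from binary stream src into bytes on dst.

    Hypothetical helper: zero and one are the single-byte symbols standing
    for bit 0 and bit 1. All other bytes are discarded (an assumption).
    """
    # Map the two symbols onto b"0" / b"1" and delete every other byte.
    table = bytes.maketrans(zero + one, b"01")
    drop = bytes(b for b in range(256) if b not in zero + one)

    pending = b""  # bits left over from the previous chunk (fewer than 8)
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        bits = pending + chunk.translate(table, drop)
        usable = len(bits) - len(bits) % 8  # largest multiple of 8
        if usable:
            # Parse the bit string as a base-2 integer, most significant bit first.
            dst.write(int(bits[:usable], 2).to_bytes(usable // 8, "big"))
        pending = bits[usable:]

    if pending:
        # Assumption: pad an incomplete final byte with zero bits on the right.
        dst.write(int(pending.ljust(8, b"0"), 2).to_bytes(1, "big"))


# Small in-memory demonstration: 01000001 01000010 are the bytes "A" and "B".
demo_out = io.BytesIO()
pack_bits(io.BytesIO(b"01000001\n01000010\n"), demo_out)
print(demo_out.getvalue())  # b'AB'
```

For real files, open the input with open("test.bin", "rb") and the output with open("true_binary.bin", "wb") and pass both to pack_bits; for the X/Y file, call it with zero=b"X", one=b"Y". The output file name here is only an example.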