These are my test files:
std-test.sh:
#!/bin/bash
for i in {0..20}
do
  number=$RANDOM
  let "number %= 10"
  if [ $number -le 6 ]
  then
    echo $i
  else
    echo $i 1>&2
  fi
done
process.sh:
#!/bin/bash
while read line; do
  [[ $1 = "err" ]] && echo "$(date +%s%3N) $line" >> stderr
  [[ $1 = "out" ]] && echo "$(date +%s%3N) $line" >> stdout
done
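The timestamping mechanism in process.sh can be tried in isolation. Here is a minimal, self-contained stand-in for its "out" branch (the input lines are inlined so it runs without the script files; `%3N` for millisecond precision is a GNU date extension):

```shell
#!/bin/bash
# Hypothetical stand-in for process.sh's "out" branch: read lines from stdin
# and prepend a millisecond Unix timestamp (%3N is GNU date) to each one.
printf 'first\nsecond\n' | while read -r line; do
  echo "$(date +%s%3N) $line"
done
```

Each input line comes out prefixed with the time it was *read*, not the time it was produced, which matters for what follows.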
std-test.sh
prints the numbers 0 through 20, sending each line at random to either stdout or stderr. process.sh
reads another command's output piped to it and prepends a millisecond timestamp, appending the result to the file stderr
when err
is passed as its argument, and to the file stdout
when out
is passed. The command I use to test the whole thing is:
./std-test.sh 2> >(./process.sh err) > >(./process.sh out)
(taken from here). My test outputs are as follows:
stdout:
1486297129986 0
1486297129987 1
1486297129988 2
1486297129988 3
1486297129989 4
1486297129990 6
1486297129991 7
1486297129991 8
1486297129992 9
1486297129993 10
1486297129993 11
1486297129994 12
1486297129995 14
1486297129995 17
1486297129996 18
stderr:
1486297129986 5
1486297129987 13
1486297129987 15
1486297129988 16
1486297129989 19
1486297129990 20
When I tried parsing the output, I realized that the ordering between stderr
and stdout
is completely mixed up. Why does 5 in stderr
come before 1 in stdout
, for instance? Or 13 right after 1? What is the mistake I made here, and how could I resolve it?
Edit #1: I know I could redirect stderr and stdout directly to files, but process.sh
will eventually curl the results to a remote server. Saving them to files is just a proof of concept.
Edit #2: I am actually trying to timestamp and process docker logs programmatically.
Answer
Update:
From the comments it appears the OP was really trying to timestamp and log the output of an application running in a Docker container.
docker logs -t
already does this and is the best tool for the job.
Original answer:
Probably because you're timestamping each line with the time it is processed, not the time it is actually generated.
You launch three processes when you run that command, and they do not run in lock step: the kernel can preempt any of them whenever it likes. Because the stdout branch handles more lines, it is more likely to get preempted and resumed later, which would explain the out-of-order timestamps.
What do you really need the timestamps on the lines for? Do you simply want ordering between the lines, or do you want real true timestamps?
Edit: as @Sven Festersen pointed out in the comments, it might also be because stdout is block-buffered by default when piped (line-buffered only when attached to a terminal).