spawn strace command with Node.js child_process

Since I wasn’t happy with my previous approach and got no answers, I’m trying another method to track the output of an already running program. I based this code on an answer from Unix Stack Exchange. All I’m trying to do is retrieve the log output of a program that is already running.

Note: to attach strace to a running process without sudo, you need to allow it with the following command:

echo kernel.yama.ptrace_scope = 0 > /etc/sysctl.d/10-ptrace.conf

It will probably take effect only after a reboot.

This is my code using the Linux strace command and Node’s child_process, where the 1234 in -p1234 is the PID of the process we want to track.

  import { spawn } from 'child_process';
  const strace = spawn('strace', [ `-p1234`, '-s9999', '-e', 'write']);

  strace.stdout.on('data', (data) => {
    // I don't know why, but my output is not being returned here
    console.log(`stdout: ${data}`);
  });

  strace.stderr.on('data', (data: any) => {
    // 'data' output is something like: 'write(4, "my real log", 4)                = 4\n',
    // so we need to clean it up a little bit with regex.
    const prefix = /^write\(1, "/
    const suffix = /", [0-9]\)?\s*=?\s?[0-9]?/
    const raw = `${data}`.trim()

    if (raw.match(prefix) && raw.match(suffix)) {
      // cleaning msg
      let log = raw.replace(prefix, '').trim().replace(suffix, '').trim();
      if (!log.includes('\n')) {
        // showing msg: "my real log"
        console.log(log);
      }
    }
  });

First of all: I don’t know why my log is being output on stderr and not on stdout, but OK, I kept coding anyway.

Second: it works fine if the program I’m tracing outputs logs slowly, one print at a time.

But when there is a sequence of 2, 3 or 4 prints, it doesn’t work. I guess I’m taking too long to clean up the message, so when the next strace.stderr.on() callback fires I’m still in the middle of processing and something goes wrong. In this case, I only see the first print of the sequence.

I need to strip a lot of information because each message is output as something like:

'write(4, "9883", 4)                     = 4\n'

Which is a mess. A command that prints only the content of the message might fix my problem too.

Any idea how I can obtain more consistent output for those messages? As I said, my only purpose is to retrieve the messages that are being output by another program that is already running. I also tried the “log to a file and read the file” approach, but as I said, I ran into other issues there.

Any advice, improvement, or totally different approach to achieve this goal will be very much appreciated!

Answer

First, my reading of the strace(1) man page suggests that stderr is the default for that command (look for --output in that page).

The output that you’re getting from strace in the data event is not guaranteed to be split on newline boundaries. When you’re getting slow output, it will just happen to work correctly. Faster output will clump multiple lines together and may split lines at non-newline boundaries. For a quick fix, pipe stderr to an instance of readline:

const rl = require('readline').createInterface({input: strace.stderr})
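
To illustrate why chunk boundaries matter, here is a minimal sketch of the kind of buffering readline does for you. The helper name makeLineBuffer and the simulated chunks are made up for this example and are not part of Node or strace; in real code the readline interface above is the simpler choice.

```javascript
// Hypothetical helper (not a Node API): collects chunks and calls
// onLine once for every complete, newline-terminated line.
function makeLineBuffer(onLine) {
  let pending = '';
  return {
    push(chunk) {
      pending += chunk;
      const parts = pending.split('\n');
      // The last piece has no trailing newline yet, so keep it for later.
      pending = parts.pop();
      for (const line of parts) onLine(line);
    },
    flush() {
      // Emit whatever is left when the stream ends.
      if (pending !== '') onLine(pending);
      pending = '';
    },
  };
}

// Simulated 'data' events: three strace lines arriving as two chunks,
// with the second line split in the middle.
const lines = [];
const buf = makeLineBuffer((line) => lines.push(line));
buf.push('write(1, "a\\n", 2) = 2\nwrite(1, "b\\n"');
buf.push(', 2) = 2\nwrite(1, "c\\n", 2) = 2\n');
buf.flush();

console.log(lines);
```

With a handler that treats every data event as exactly one line, the second and third writes here would be mangled or dropped, which matches the symptom in the question.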

Once you have that done, if you need further help parsing the output, you can use a better regular expression to parse the whole line at once:

rl.on('line', line => {
  const re = /^write\((?<fd>\d+),\s*"(?<data>(?:[^"]|\\")*)",\s*(?<len>\d+)\)\s*=\s*(?<written>\d+)/
  const m = line.match(re)
  if (m) {
    console.log(m.groups.data)
  }
})

That uses the new named capture groups, so if you’re using a very old version of node, remove the portions that look like ?<fd> and instead of m.groups.data use m[2].

This also assumes that a double-quote in the strace output gets output as \", but I don’t know if that’s actually the case.
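
If you also want the original text back rather than the escaped form, something along these lines might work. It is only a sketch: it assumes strace prints strings with C-style escapes (\n, \t, \" and octal codes), and parseWriteLine, unescapeStraceString and the escape table are hypothetical names introduced here, not part of strace or Node.

```javascript
// Hypothetical helpers; the names and the escape table are illustrative assumptions.
const WRITE_RE =
  /^write\((?<fd>\d+),\s*"(?<data>(?:[^"\\]|\\.)*)",\s*(?<len>\d+)\)\s*=\s*(?<ret>-?\d+)/;

const SIMPLE_ESCAPES = { n: '\n', t: '\t', r: '\r', v: '\v', f: '\f', '"': '"', '\\': '\\' };

// Turn the escaped string shown by strace back into plain text.
// Octal escapes are mapped one byte to one character, so multi-byte
// UTF-8 sequences are NOT reassembled here.
function unescapeStraceString(s) {
  return s.replace(/\\([0-7]{1,3}|.)/g, (whole, esc) => {
    if (/^[0-7]/.test(esc)) return String.fromCharCode(parseInt(esc, 8));
    return Object.prototype.hasOwnProperty.call(SIMPLE_ESCAPES, esc)
      ? SIMPLE_ESCAPES[esc]
      : whole; // unknown escape: leave it untouched
  });
}

// Parse one complete line of strace output; returns null for anything
// that is not a write(...) call in the expected shape.
function parseWriteLine(line) {
  const m = line.match(WRITE_RE);
  if (!m) return null;
  return { fd: Number(m.groups.fd), data: unescapeStraceString(m.groups.data) };
}

const sample = parseWriteLine('write(4, "say \\"hi\\"\\n", 9)      = 9');
console.log(sample);
```

In the readline handler you would call parseWriteLine(line) and print the data field when the result is not null. Filtering on fd (for example, only 1 and 2) is an option if the traced program also writes to files or sockets.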

User contributions licensed under: CC BY-SA