
What would be an optimal way to perform a DNS query from bash in Python 3?

I have this simple bash script that I’m thinking about incorporating into my Python project, as I couldn’t figure out a graceful way to do this in Python 3 compared to the single bash one-liner. Is there a better way to do this in Python 3, or a library that would help with storing all the legitimate unique hostnames in a list or dictionary?

I’ve tried doing something along the lines of,

test = []
try:
    with open("dig.log", "r") as d:
        for line in d:
            parsed_line = line.rstrip()
            # skip dig's comment lines, which start with ";"
            if not parsed_line.startswith(";"):
                test.append(parsed_line.split())
except FileNotFoundError as fnf_error:
    print(fnf_error)

which outputs,

10.10.10.10.in-addr.arpa. 604800 IN     PTR     ns1.yowhat.sup

10.10.10.in-addr.arpa.  604800  IN      NS      ns1.yowhat.sup

ns1.yowhat.sup.         604800  IN      A       10.10.10.10    

with a bunch of blank lines. I couldn’t figure out how to gracefully strip() all the blank lines and return only the unique hostnames in Python. I can get the exact functionality with a single bash one-liner as follows:

grep -v ";" $dig_file | sed 's/\.$//g' | sed -r '/^\s*$/d' | sed -n -e 's/^.*PTR\t//p; s/^.*NS\t//p; s/^.*MX\t//p; s/^.*CNAME\t//p; s/^.*TXT\t//p' | sort -u >$output_file_name

Which will output,

ns1.yowhat.sup

to a file. The helper bash script that I’m using in my Python program is,

#!/usr/bin/env bash

dig_file=$1
output_file_name=$2
NICE='\e[1;32;92m[+]\e[0m'

parse_dig() {
    echo -e "${NICE} parsing dig queries to find hostnames ya dig?"
    grep -v ";" $dig_file | sed 's/\.$//g' | sed -r '/^\s*$/d' | sed -n -e 's/^.*PTR\t//p; s/^.*NS\t//p; s/^.*MX\t//p; s/^.*CNAME\t//p; s/^.*TXT\t//p' | sort -u >$output_file_name
}
parse_dig

Which I would then call in my Python project by doing something like,

subprocess.call("./parse_dig dig.log host_names.log", shell=True)
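
As an aside, the same helper can be invoked without going through a shell by passing the arguments as a list, which avoids quoting issues; a minimal sketch, assuming parse_dig has its executable bit set:

import subprocess

# list form: no shell is spawned, so no quoting or injection concerns
subprocess.call(["./parse_dig", "dig.log", "host_names.log"])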

How can I do what my simple bash helper script does in Python 3, so as not to require a bunch of bash scripts to parse output from files? Would it make more sense to not use

subprocess.call("dig command | tee dig.log" , shell=True)

and do something like,

dig_output = subprocess.check_output("dig command...", shell=True, stderr=subprocess.STDOUT)

and then somehow parse dig_output in Python? What would be the most elegant, pythonic way to do this in Python 3?


Answer

You’ll need this to run dig from Python and capture the output:

from subprocess import PIPE, Popen

def cmdline(command):
    # run the command through a shell and capture its stdout
    process = Popen(
        args=command,
        stdout=PIPE,
        shell=True,
        universal_newlines=True,  # return str instead of bytes on Python 3
    )
    return process.communicate()[0]

After that things get quite easy:

>>> dig_output = [i.strip() for i in cmdline('dig google.com ns').split('\n')]
>>> dig_filtered = [i.split() for i in dig_output if len(i) > 10]
>>> domains = [i[-1] for i in dig_filtered if i[-2] in ['PTR', 'MX', 'NS', 'CNAME', 'TXT']]
>>> domains
['ns1.google.com.', 'ns2.google.com.', 'ns4.google.com.', 'ns3.google.com.']
>>> 
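
A couple of follow-up notes. On Python 3.5+ you can let subprocess.run do the decoding and skip the Popen boilerplate; below is a minimal sketch of the whole pipeline (run dig, filter record types, dedupe). The function name unique_hostnames and the exact record-type set are illustrative choices, not something from the original scripts:

import subprocess

def unique_hostnames(command):
    """Run a dig command and return the unique hostnames it found, sorted."""
    result = subprocess.run(
        command.split(),          # list form: no shell involved
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # decode bytes to str (text=True on 3.7+)
    )
    wanted = {"PTR", "NS", "MX", "CNAME", "TXT"}
    hostnames = set()
    for line in result.stdout.splitlines():
        fields = line.split()
        # dig answer lines look like: name  TTL  class  type  rdata
        if len(fields) >= 5 and not fields[0].startswith(";") and fields[3] in wanted:
            hostnames.add(fields[4].rstrip("."))
    return sorted(hostnames)

print(unique_hostnames("dig google.com ns"))

And since the question also asks about a library: the third-party dnspython package can do the lookup without shelling out at all (whether it fits the project is an assumption on my part; install it with pip install dnspython):

import dns.resolver  # third-party: pip install dnspython

# resolve() is the dnspython >= 2.0 spelling; older versions call it query()
answers = dns.resolver.resolve("google.com", "NS")
domains = sorted({str(rdata).rstrip(".") for rdata in answers})
print(domains)  # e.g. ['ns1.google.com', 'ns2.google.com', ...]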