In my Python code I need to get the list of “physical” WiFi network devices connected to a Raspberry Pi.
I’ve been doing this by calling:
raw_output = check_output('iw dev', shell=True)
and then extracting all the data I need from raw_output.
It works OK, but the iw help says: “Do NOT screenscrape this tool, we don't consider its output stable.”
Is it really unsafe to get this data the way I did it? If yes, what is the correct way to do this?
Answer
What is meant by “Do NOT screenscrape this tool, we don’t consider its output stable” is that the output formatting may change as new releases of iw are made. The developers of iw are warning you that software which depends on parsing its output may break with future releases of iw.
Take the example of the venerable ifconfig command. For many years, its output was formatted like so:
    eth0      Link encap:Ethernet  HWaddr 00:80:C8:F8:4A:51
              inet addr:192.168.99.35  Bcast:192.168.99.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:190312 errors:0 dropped:0 overruns:0 frame:0
              TX packets:86955 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:100
              RX bytes:30701229 (29.2 Mb)  TX bytes:7878951 (7.5 Mb)
              Interrupt:9 Base address:0x5000
And though it was considered stable (even deprecated and unmaintained by some), it changed a couple of years ago and now looks like this:
    eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
            inet 192.168.1.67  netmask 255.255.255.0  broadcast 192.168.1.255
            inet6 fe80::8e89:a5ff:fe57:103c  prefixlen 64  scopeid 0x20<link>
            ether 8c:89:a5:57:10:3c  txqueuelen 1000  (Ethernet)
            RX packets 2219946  bytes 3178868967 (2.9 GiB)
            RX errors 0  dropped 0  overruns 0  frame 0
            TX packets 1241676  bytes 102998523 (98.2 MiB)
            TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
…so let’s say I wrote some software that finds the MAC address by searching for the string following “HWaddr”. Nowadays it would be broken, because it should look for the string following “ether” instead.
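One way to soften this kind of breakage is to match the shape of the value instead of the label in front of it. The following is only a sketch: `find_mac` and the two sample strings are illustrative names, the samples are shortened from the two ifconfig outputs above, and the pattern is a heuristic that could also match other colon-separated hex text.

```python
import re

# Six colon-separated hex pairs, whatever label ("HWaddr", "ether", ...) precedes them.
MAC_RE = re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")


def find_mac(ifconfig_text):
    """Return the first MAC-looking token in the text, lowercased, or None."""
    match = MAC_RE.search(ifconfig_text)
    return match.group(0).lower() if match else None


# Shortened excerpts of the old and new ifconfig formats quoted above.
OLD_STYLE = "eth0 Link encap:Ethernet HWaddr 00:80:C8:F8:4A:51 inet addr:192.168.99.35"
NEW_STYLE = (
    "eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500 "
    "inet 192.168.1.67 inet6 fe80::8e89:a5ff:fe57:103c prefixlen 64 "
    "ether 8c:89:a5:57:10:3c txqueuelen 1000"
)

if __name__ == "__main__":
    print(find_mac(OLD_STYLE))
    print(find_mac(NEW_STYLE))
```

This survives a label rename, but it is still screen scraping: a format change that moves or reorders addresses could make it pick the wrong one.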
But as long as you don’t update iw, or as long as you regularly test your work, you should not encounter any problem.
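“Regular testing” can be as small as a parser that fails loudly, plus a captured sample to check it against. This is a rough sketch, not a reference parser: `parse_iw_dev` is a hypothetical helper, and `SAMPLE_IW_DEV` is an illustrative capture whose layout is assumed from one iw release and may differ on yours.

```python
import re

# Illustrative capture of `iw dev` output. The exact layout is an assumption
# and is precisely what may change between iw releases.
SAMPLE_IW_DEV = """\
phy#0
\tInterface wlan0
\t\tifindex 3
\t\ttype managed
"""


def parse_iw_dev(text):
    """Map each phy name to the list of interface names listed under it."""
    devices = {}
    current = None
    for line in text.splitlines():
        phy = re.match(r"^phy#(\d+)\s*$", line)
        iface = re.match(r"^\s+Interface\s+(\S+)\s*$", line)
        if phy:
            current = "phy" + phy.group(1)
            devices[current] = []
        elif iface and current is not None:
            devices[current].append(iface.group(1))
    # Non-empty output with nothing recognized: fail instead of guessing.
    if text.strip() and not devices:
        raise ValueError("unrecognized `iw dev` output; the format may have changed")
    return devices


if __name__ == "__main__":
    print(parse_iw_dev(SAMPLE_IW_DEV))
```

Re-running such a check after each iw upgrade, against a fresh capture, is what turns a silent breakage into a visible one.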
Parsing the output of a third-party tool is always inherently a bit fragile anyway; you just have to be aware of it. For instance, the output may depend on the locale set up by the user. Real-life example: some scripting I did with the output of ifconfig failed in some users’ environments. Root cause: here is what the output looks like in a French locale:
    eth0      Lien encap:Ethernet  HWaddr 00:FF:F2:58:32:A1
              UP BROADCAST MULTICAST  MTU:1500  Metric:1
              Packets reçus:0 erreurs:0 :0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 lg file transmission:1000
              Octets reçus:0 (0.0 b) Octets transmis:0 (0.0 b)
              Interruption:23 Adresse de base:0x2000
Notice the French “Packets reçus”, “erreurs”, and “Octets reçus” instead of “RX packets”, “errors”, and “RX bytes”.
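A common way to sidestep the locale problem is to force the “C” locale for the child process only. A minimal sketch, assuming Python 3.7 or later; `c_locale_env` and `run_untranslated` are hypothetical helper names, not part of any library.

```python
import os
import subprocess


def c_locale_env(base_env=None):
    """Return a copy of the environment with the locale forced to "C"."""
    env = dict(os.environ if base_env is None else base_env)
    env["LC_ALL"] = "C"  # LC_ALL overrides every other LC_* setting
    env["LANG"] = "C"
    return env


def run_untranslated(argv):
    """Run a command with untranslated messages and return its stdout as text."""
    return subprocess.check_output(argv, env=c_locale_env(), text=True)


# Usage (not executed here, since it needs the tool to be installed):
#     raw_output = run_untranslated(["iw", "dev"])
```

This removes the translation problem only; the output format can still change between releases of the tool.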
EDIT:
So:
Is it really unsafe to get this data the way I did it?
Not really. You just have to keep in mind that your software depends on the output strings of some third-party software that is somewhat out of your control and may change in the future. That means regular testing and maintenance work for you. Nothing tragic; that’s software life.
If yes, what is the correct way to do this?
Again, “no”, but if you want to be bulletproof against that: do not depend on the textual output of third-party software. This usually involves writing your own code to replace these tools, which can be quite a task. And if you use some third-party libraries to do so, well, library APIs change over time too… 🙂
EDIT 2:
In your case, to not depend on the output of iw (i.e. write your own “mini iw”), and considering you want to code in Python:
At the low level, iw, written in C, uses libnl (also in C) to communicate with the kernel to get information about, and perform actions on, network interfaces.
https://www.infradead.org/~tgr/libnl/
You’re lucky: it seems there is an actively maintained Python version of the libnl library.
https://pypi.python.org/pypi/libnl/0.2.0
So the plan would be:
- Study the iw C source code to get an idea of the netlink/libnl operations it performs to get/set the parts you’re interested in.
- Use the Python libnl in your code to replicate that.
(Be warned that libnl/netlink is designed as a very generic, long-term extensible mechanism, and it really is designed with those goals in mind, to replace ad-hoc ioctl calls. With that genericity comes a certain complexity: it can be quite complicated and involve a lot of coding to perform even simple tasks.)
As I wrote above, writing your own code to replace a tool can be quite a task. Grepping the output of a command takes minutes to code, whereas this may take days or weeks of work. So you have to choose between “quick and easy but not so clean” and “self-contained, clean, extensible but expensive”. It depends: are you producing industrial-grade, customer-supported software, an internal company tool, or just a weekend hobby project for fun?
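For the narrow task of just listing WiFi interfaces, there may be a middle ground that needs neither iw nor netlink. This sketch assumes a Linux kernel that exposes a `phy80211` link under `/sys/class/net/<interface>/` for cfg80211-based wireless devices; `wifi_interfaces` is a hypothetical helper name. Check that this holds on your Raspberry Pi before relying on it.

```python
import os


def wifi_interfaces(sysfs_net="/sys/class/net"):
    """Map each wireless interface name to its phy name (e.g. wlan0 -> phy0)."""
    result = {}
    if not os.path.isdir(sysfs_net):
        return result
    for iface in sorted(os.listdir(sysfs_net)):
        # Assumption: only wireless interfaces carry a "phy80211" entry.
        phy_link = os.path.join(sysfs_net, iface, "phy80211")
        if os.path.exists(phy_link):
            # The link target's last path component is the phy name.
            result[iface] = os.path.basename(os.path.realpath(phy_link))
    return result


if __name__ == "__main__":
    print(wifi_interfaces())
```

Anything beyond listing (scanning, changing modes, and so on) still needs iw or netlink, so this does not replace the plan above.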