
Finding the number of bytes of entered string at runtime

I’m new to x86 assembly. I have written a program that asks the user to enter a number, checks whether it’s even or odd, and then prints a message saying which. The code works fine, but it has one problem: it only works for 1-digit numbers:

JavaScript

It does not work properly for numbers with more than 1 digit because it only takes into account the first byte (the first digit) of the entered number, so that is all it checks. So I need a way to find out how many digits (bytes) there are in the value the user enters, so I could do something like this:

;Convert the variable to a number and check if even or odd
mov eax, [myvariable+(number_of_digits-1)]

And then only check eax, which contains the last digit, to see if it’s even or odd. The problem is that I have no idea how to check how many bytes are in my number after the user has entered it. I’m sure it’s something very easy, yet I have not been able to figure it out, nor have I found any solutions on Google. Please help me with this. Thank you!


Answer

You actually want movzx eax, byte [myvariable+(number_of_digits-1)] to only load 1 byte, not a dword. Or just directly test memory with test byte [...], 1. You can skip the sub because '0' is an even number; subtracting to convert from ASCII code to integer digit doesn’t change the low bit.

But yes, you need the least significant digit, which is the last one (highest address) in printing / reading order.
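As a minimal sketch of the point about loading a single byte and looking only at its low bit: the label last_digit and the hard-coded '7' are placeholders I made up, standing in for the last digit of the user's input. It reports the parity through the exit status just to keep the example short.

```nasm
; Sketch only: parity of one ASCII digit, 32-bit Linux, NASM syntax.
; last_digit is a hypothetical label, not part of the original program.
section .data
last_digit: db '7'

section .text
global _start
_start:
    movzx eax, byte [last_digit]  ; load exactly 1 byte, zero-extended to 32 bits
    and   eax, 1                  ; '0' is 0x30 (even), so the ASCII low bit is the digit's low bit
    mov   ebx, eax                ; exit status: 1 = odd, 0 = even
    mov   eax, 1                  ; __NR_exit on i386
    int   0x80
```

With '7' in the buffer, the process exits with status 1 (check with echo $?). If you only want to branch, test byte [last_digit], 1 followed by jz / jnz avoids the load into a register.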

A read system call returns the number of bytes read in EAX (or a negative error code). This count will include a newline if the user hit return, but not if the user redirected from a file that didn’t end with a newline, or if they submitted input on a terminal using control-d after typing some digits. The simplest and most robust way would be to loop, looking for the first non-digit in the buffer.
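A rough sketch of that robust loop, under my own assumptions: the buffer name buf, its 32-byte size, and reporting the digit count via the exit status are all made up for illustration and are not from the original program.

```nasm
; Sketch only: count the leading ASCII digits of what read() returned.
; buf and BUFSIZE are hypothetical names chosen for this example.
BUFSIZE equ 32

section .bss
buf: resb BUFSIZE

section .text
global _start
_start:
    mov   eax, 3                  ; __NR_read on i386
    xor   ebx, ebx                ; fd 0 = stdin
    mov   ecx, buf
    mov   edx, BUFSIZE
    int   0x80                    ; EAX = bytes read, 0 on EOF, or negative errno

    xor   esi, esi                ; esi = index of the first non-digit so far
    test  eax, eax
    jle   .done                   ; EOF or error: zero digits
.scan:
    movzx edx, byte [buf + esi]
    sub   edx, '0'
    cmp   edx, 9                  ; one unsigned compare rejects both c < '0' and c > '9'
    ja    .done
    inc   esi
    cmp   esi, eax
    jb    .scan                   ; stop at the end of what was actually read
.done:
    mov   ebx, esi                ; exit status = number of leading digits
    mov   eax, 1                  ; __NR_exit
    int   0x80
```

If esi is non-zero after the loop, the last digit is at [buf + esi - 1], regardless of whether a newline followed it.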

But the “clever” / fun way would be to check if [mybuffer + eax - 1] is a digit, and if so use it. Otherwise check the previous byte. Or just assume there’s a newline and always check [mybuffer + eax - 2], the 2nd-last byte of what was read (which is off the start of the buffer if the user just pressed return).
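Here is one way the "check the last byte, else the one before it" idea could look. To keep it self-contained I hard-coded the input; the label input and the constant input_len are my own stand-ins for the buffer and for the value read would leave in EAX.

```nasm
; Sketch only: pick the last digit from the end of the input.
; input / input_len are hypothetical stand-ins for the buffer and read()'s return value.
section .data
input:     db "1234", 10          ; what a user typing 1234 + return would produce
input_len  equ $ - input

section .text
global _start
_start:
    mov   eax, input_len          ; pretend this came back from read()
    test  eax, eax
    jle   .no_digit               ; EOF or error: nothing to look at
    movzx edx, byte [input + eax - 1]
    lea   ecx, [edx - '0']
    cmp   ecx, 9
    jbe   .have_digit             ; last byte is a digit (no trailing newline)
    cmp   eax, 2
    jl    .no_digit               ; only 1 byte was read: don't go before the buffer
    movzx edx, byte [input + eax - 2]
    lea   ecx, [edx - '0']
    cmp   ecx, 9
    ja    .no_digit
.have_digit:
    and   edx, 1                  ; low bit of the ASCII code = parity of the digit
    mov   ebx, edx                ; exit status: 0 = even, 1 = odd
    jmp   .exit
.no_digit:
    mov   ebx, 2                  ; arbitrary status meaning "no digit found"
.exit:
    mov   eax, 1                  ; __NR_exit on i386
    int   0x80
```

For the hard-coded "1234" plus newline, the last byte is not a digit, the one before it is '4', and the exit status is 0.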

(To efficiently check for an ASCII digit: sub al, '0' / cmp al, 9 / ja non_digit. See "double condition checking in assembly" and "What is the idea behind ^= 32, that converts lowercase letters to upper and vice versa?")


Just for fun, here’s a more compact version that always just checks the 2nd-last byte of the read() input. (It doesn’t check for being a digit, and it reads outside the buffer for input lengths of 0 or 1, e.g. pressing control-D or return. The same goes for read errors, e.g. running strace ./oddeven <&- to close its stdin.)

Note the interesting part:

JavaScript

I used cmov, but a simple branch over a mov ecx, msg_odd would work. You don’t need to duplicate the whole setup for the system call, just run it with the right pointer and length. (ECX and EDX values, and I padded the odd message with a space so I could use the same length for both.)
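To illustrate the "simple branch over a mov" alternative, here is a minimal sketch. The message texts and the digit label are placeholders I invented; only the names msg_odd and msg_even and the idea of padding the odd message with a space come from the text above.

```nasm
; Sketch only: one write call, with the message pointer chosen by a branch.
; Message contents and the digit label are hypothetical.
section .rodata
msg_even: db "even", 10
.len      equ $ - msg_even        ; defines msg_even.len
msg_odd:  db "odd ", 10           ; padded with a space so both lengths match
.len      equ $ - msg_odd         ; defines msg_odd.len

section .data
digit: db '3'                     ; stand-in for the last digit of the input

section .text
global _start
_start:
    mov   ecx, msg_even           ; default pointer
    test  byte [digit], 1
    jz    .print                  ; low bit clear: keep the even message
    mov   ecx, msg_odd            ; otherwise replace the pointer
.print:
    mov   edx, msg_even.len       ; same length for both messages
    mov   eax, 4                  ; __NR_write on i386
    mov   ebx, 1                  ; fd 1 = stdout
    int   0x80

    mov   eax, 1                  ; __NR_exit
    xor   ebx, ebx
    int   0x80
```

A branchless variant would load msg_odd into a scratch register and use cmovnz ecx, reg after the test, since cmov has no immediate-operand form.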

And this is a homebrewed static_assert(msg_odd.len == msg_even.len), using NASM’s conditional directives (https://nasm.us/doc/nasmdoc4.html). It’s not just a separate preprocessor like C has, it can use NASM numeric equ expressions.
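A sketch of what such a check could look like, assuming the two lengths are equ constants defined earlier in the file. The message texts and the error wording are my own placeholders.

```nasm
; Sketch only: assemble-time check that two equ lengths match.
; Message contents and the error text are hypothetical.
section .rodata
msg_even: db "even", 10
.len      equ $ - msg_even        ; msg_even.len
msg_odd:  db "odd ", 10
.len      equ $ - msg_odd         ; msg_odd.len

; Must come after both definitions so the lengths are already known here.
%if msg_odd.len != msg_even.len
  %error "msg_odd and msg_even must have the same length"
%endif
```

If you remove the padding space from msg_odd, assembling should stop at the %error line instead of producing an object file.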

JavaScript

The full thing. Outside of the part shown above, I just tweaked comments, simplifying them where I thought they were too redundant, and used meaningful label names.

Also, I put .rodata and .bss at the top because NASM complained about referencing msg_odd.len before it was defined. (You previously had your strings in .data, but read-only data should generally go in .rodata, so the OS can share those pages between runs of the same program because they stay clean.)

Other fixes:

  • Linux/Unix uses 0xa line endings, \n not \n\r.
  • stdin is fd 0. 2 is stderr. (2 happens to work because terminal emulators normally run the shell with all 3 file descriptors referring to the same read+write open file description for the tty).
JavaScript

assemble + link with nasm -felf32 oddeven.asm && ld -melf_i386 -o oddeven oddeven.o

User contributions licensed under: CC BY-SA