
Understanding getaddrinfo function in C

I’m new to C and socket programming and have a question about the getaddrinfo function. The function prototype of getaddrinfo is:

int getaddrinfo(const char *host, const char *service, const struct addrinfo *hints, struct addrinfo **result);

getaddrinfo returns a result that points to a linked list of addrinfo structures, each of which points to a socket address structure that corresponds to host and service.

Below are my questions:

Q1 - Why does it need to return a result that points to a linked list of addrinfo structures? Given a host and a service, I would expect only one unique socket address. How can there be more than one valid socket address, so that a linked list is needed?

Q2 - The last parameter is struct addrinfo **result. Why is it a pointer to a pointer? Why isn’t it struct addrinfo *result, with getaddrinfo creating something internally and making result (a struct addrinfo *) point to it? Someone said it is because getaddrinfo calls malloc internally, but I can also write code like this:

int main() 
{
   char *ptr_main;
   test(ptr_main);
   free(ptr_main);
}

void test(char * ptr)
{ 
    ptr = malloc(10); 
}

so the parameter to the function is char *ptr, not char **ptr.


Answer

getaddrinfo() returns a list of addresses because a hostname can have more than one address. Think, for example, of high-traffic sites that need to distribute visitors across different IPs.

Since getaddrinfo()

combines the functionality provided by the gethostbyname(3) and getservbyname(3) functions into a single interface, but unlike the latter functions, getaddrinfo() is reentrant and allows programs to eliminate IPv4-versus-IPv6 dependencies

it might trigger a DNS lookup to resolve a hostname. For the high-traffic sites mentioned above, the same hostname will correspond to a list of actual addresses. The list can also contain separate entries per address family (IPv4 and IPv6) and per socket type.


You also ask:

struct addrinfo **result, why is it a pointer to a pointer?

In C, a pointer to something is passed as a function parameter when the function has to modify that thing. So, for example, if you need to modify an integer, you pass an int *. This kind of modification is very common in C when you want to return something through a parameter: the function can return an extra integer by writing through the pointer it received.

But what if a function wants to allocate something? Internally it will do something like type *var = malloc(...), so the value to be returned is a pointer to type. To return it through a parameter, we need to pass a type ** parameter.

Is the logic clear? Given a type, if you want to return it through a parameter, you have to declare the parameter as a pointer to that type.

In conclusion, in our case the function getaddrinfo needs to modify a variable whose type is struct addrinfo *, so struct addrinfo ** is the parameter type.

Just to mention the meaning of this parameter:

The getaddrinfo() function allocates and initializes a linked list of addrinfo structures, one for each network address that matches node and service, subject to any restrictions imposed by hints, and returns a pointer to the start of the list in res. The items in the linked list are linked by the ai_next field.

As you can see, there actually is an allocation inside the function. So this memory will eventually need to be freed:

The freeaddrinfo() function frees the memory that was allocated for the dynamically allocated linked list res.


Why not just a type * parameter?

Your code example results in undefined behavior; when I ran it, the program crashed.

Why? As I wrote above, in C parameters are passed by value. This means that for a function func(int c), called like this

int b = 1234;

func(b);

the parameter c used inside the function will be a copy of b, and any change to it won’t survive outside the function.

The same happens with func(char * ptr) (note the wide spacing, which underlines that the type is char * and the variable is ptr): any change to ptr won’t survive outside the function. You’ll be able to change the memory it points to, and those changes will be visible after the function returns, but the variable passed as an argument will hold the same value it had before the call to func().

And what was the value of ptr_main in your example before test was called? We don’t know, as the variable is uninitialized. It still holds that indeterminate value when free(ptr_main) runs, so the behavior is undefined.

If you still have doubts, here is a program that demonstrates that the newly allocated address, assigned to a parameter passed by value, cannot be accessed from outside the function:

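#include <stdlib.h>
#include <stdio.h>

void test(char * ptr)
{
    ptr = malloc(10);   /* only the local copy of the pointer changes */

    printf("test:\t%p\n", (void *) ptr);
    free(ptr);          /* the caller never sees this address, so free it here */
}

int main(void)
{
    char *ptr_main = (char *) 0x7777;   /* arbitrary marker value, never dereferenced */

    printf("main-1:\t%p\n", (void *) ptr_main);
    test(ptr_main);
    printf("main-2:\t%p\n", (void *) ptr_main);
}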

Output:

main-1: 0000000000007777
test:   0000000000A96D60
main-2: 0000000000007777

Even after the function call, the value of ptr_main is the same as when I initialized it (0x7777).

User contributions licensed under: CC BY-SA