
Segmentation Fault in pthreads, Linux Ubuntu

I’m getting a Segmentation Fault when I run this code. Surprisingly, when I set thread_count to 16 or less, it doesn’t give any error. When I debug the code using gdb, the code gets an error at line local_answer += vec_1[j] * vec_2[j]; in the Calculate() thread function. What is the reason for this behavior? How can I fix that?

I’m compiling with this gcc command:

gcc test.c -o DP -lpthread -lm -mcmodel=large -g

And here’s the code:

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <time.h>
#include <math.h>
#include <pthread.h>

double *vec_1 = NULL;
double *vec_2 = NULL;
int vec_length = 0;
int thread_count = 0;
double answer = 0;
double *partial_results = NULL;
pthread_mutex_t mutex;

void *Calculate(void *arg) {    
    int myId = (int) arg;
    int myStart = myId * vec_length / thread_count;
    int myEnd = (myId + 1) * vec_length / thread_count;

    double local_answer = 0;
    int j;

    for(j = myStart; j < myEnd; j++) {
        local_answer += vec_1[j] * vec_2[j];
    }
    pthread_mutex_lock(&mutex);
    partial_results[myId] = local_answer;
    pthread_mutex_unlock(&mutex);
}

int main(int argc, const char *argv[]) {    
    srand((unsigned int) time(NULL)); 
    pthread_mutex_init(&mutex, NULL);

    int num_iterations = 5;

    vec_length = 1000000000;  
    thread_count = 25;

    partial_results = (double*) malloc(thread_count * sizeof(double)); 

    double avg_time = 0;
    int i;

    vec_1 = (double*) malloc(vec_length * sizeof(double)); 
    vec_2 = (double*) malloc(vec_length * sizeof(double));

    if(vec_1==NULL || vec_2==NULL){
        printf("Memory Allocation failed");
        exit(0);
    }

    int j;
    for (j = 0; j < vec_length; j++) {
        vec_1[j] = ((double) rand() / (double) (RAND_MAX)) + 1;
        vec_2[j] = ((double) rand() / (double) (RAND_MAX)) + 1;             
    }

    for (i = 0; i < num_iterations; i++) {
        pthread_t threads[thread_count];
        pthread_attr_t attr;
        void* status;       
        struct timeval t1, t2;
        gettimeofday(&t1, NULL);

        pthread_attr_init(&attr);
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);

        int t;
        for (t = 0; t < thread_count; t++) {            
            if (pthread_create(&threads[t], NULL, Calculate, (void*)(t))) {
                printf("ERROR in pthread_create()");
                exit(-1);
            }
        }

        pthread_attr_destroy(&attr);

        answer = 0;
        for (t = 0; t < thread_count; t++) {            
            if (pthread_join(threads[t], &status)) {
                printf("ERROR in pthread_join()");
                exit(-1);
            }
            answer += partial_results[t];
        }

        gettimeofday(&t2, NULL);     
        avg_time += (t2.tv_sec - t1.tv_sec) * 1000.0 + (t2.tv_usec - t1.tv_usec) / 1000.0;
    }

    printf("Average time Spent : %lf \n", avg_time / num_iterations);

    pthread_mutex_destroy(&mutex);
    return 0;
}


Answer

Your vec_length has type int. With gcc on Linux x86 or x86_64, int is represented in 32-bit two’s complement format. This is sufficient to accommodate the value you’re using for vec_length, 1,000,000,000, but not to accommodate most integer multiples of that value. You compute several such multiples, and the resulting overflow of a signed integer formally produces undefined behavior.

In practice, gcc’s behavior upon signed integer overflow is likely to be reproducible: the result typically wraps around modulo 2^32. In that case, you can write a short program to demonstrate for yourself that the results are negative for several small-integer multiples of your vector length (for example, 3 * 1,000,000,000 wraps to -1,294,967,296). Where that occurs, your program attempts to access elements outside the bounds of both vectors, at exactly the line where the error is reported, and a segfault is a likely result. (Even if the overflow results were not reproducible, a negative result for some of those overflowing multiplications would still be well within the realm of possibility.)
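The wraparound can be reproduced without invoking undefined behavior in the demonstration itself. The following is a minimal sketch, assuming the overflow wraps in 32-bit two's complement; the helper names wrap32, wrapped_start and print_wrapped_starts are hypothetical and not part of the original program.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Reduce a 64-bit value to what a wrapping 32-bit signed multiply would
 * leave behind. Uses only well-defined conversions. */
int32_t wrap32(int64_t x) {
    uint32_t u = (uint32_t) x;              /* reduces modulo 2^32 */
    if (u <= (uint32_t) INT32_MAX) {
        return (int32_t) u;                 /* value fits, no wrap */
    }
    return (int32_t) ((int64_t) u - 4294967296LL);  /* negative half */
}

/* Start index as "myId * vec_length / thread_count" would compute it
 * if the int multiplication wraps (an assumption, since it is formally UB). */
int32_t wrapped_start(int32_t my_id, int32_t vec_length, int32_t thread_count) {
    return wrap32((int64_t) my_id * vec_length) / thread_count;
}

/* Print the wrapped start index for every thread ID. */
void print_wrapped_starts(int32_t vec_length, int32_t thread_count) {
    for (int32_t id = 0; id < thread_count; id++) {
        printf("thread %2" PRId32 ": start = %" PRId32 "\n",
               id, wrapped_start(id, vec_length, thread_count));
    }
}
```

Calling print_wrapped_starts(1000000000, 25) from main shows the start index going negative for the first time at thread ID 3, which is the first multiple of the vector length that no longer fits in 32 bits.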

You have several alternatives, among them:

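  • use a wider data type for your indexing computations (int64_t is declared in <stdint.h>, which the program would then need to include)

    int myStart = myId * (int64_t) vec_length / thread_count;
    int myEnd = (myId + 1) * (int64_t) vec_length / thread_count;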
    
  • use only thread_count values that evenly divide the vec_length, and use parentheses to ensure that the division is performed first in your indexing computations

    int myStart = myId * (vec_length / thread_count);
    // ...
    vec_length = 1000000000;
    thread_count = 32;  // or 10 or 8 or 1000
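To compare the first alternative with the original arithmetic, the bound computation can be isolated in a small helper. This is only a sketch: slice_bound is a hypothetical name, and it assumes non-negative arguments with thread_count greater than zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Index at which thread `id` starts its slice. The product is formed in
 * 64 bits, so it cannot overflow for int-sized inputs. The quotient is at
 * most vec_length, so it fits in the result type. Passing id + 1 gives the
 * end of the slice (one past its last element). */
size_t slice_bound(int id, int vec_length, int thread_count) {
    return (size_t) ((int64_t) id * vec_length / thread_count);
}
```

With vec_length = 1000000000 and thread_count = 25, slice_bound(3, ...) yields 120000000 rather than a negative value, and slice_bound(25, ...) yields exactly 1000000000. Because the multiplication happens before the division, this form also covers the whole vector when thread_count does not divide vec_length evenly: for a length of 10 and 3 threads, the bounds are 0, 3, 6 and 10.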
    

A few other things:

  • The code presented does not use any math.h functions. It therefore does not need to #include math.h, and you do not need to link in libm.
  • To compile a Pthreads program with GCC, you ought to use the -pthread flag, in which case you also do not need to explicitly link in libpthread.
  • As discussed in comments, you do not need the complication of a pthread_attr_t.
  • As discussed in comments, your particular use of a mutex is an unnecessary performance drain.
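The last two points can be illustrated together. Below is a rough sketch, not the original poster's code, of a dot product in which each thread writes only to its own result slot, so no mutex is needed, and threads are created with default attributes (which are joinable). The names dot_task, dot_worker and parallel_dot are hypothetical, and it is assumed that t * n fits in 64 bits.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Per-thread work description (hypothetical, for illustration only). */
typedef struct {
    const double *a;
    const double *b;
    size_t start;   /* first index of this thread's slice */
    size_t end;     /* one past the last index of the slice */
    double result;  /* written only by the owning thread */
} dot_task;

void *dot_worker(void *arg) {
    dot_task *task = arg;
    double sum = 0.0;
    for (size_t j = task->start; j < task->end; j++) {
        sum += task->a[j] * task->b[j];
    }
    task->result = sum;  /* private slot, so no locking is required */
    return NULL;
}

/* Computes the dot product of a and b (n elements each) using
 * thread_count threads. Returns 0 on success, -1 on failure. */
int parallel_dot(const double *a, const double *b, size_t n,
                 size_t thread_count, double *out) {
    if (thread_count == 0) {
        return -1;
    }
    pthread_t *threads = malloc(thread_count * sizeof *threads);
    dot_task *tasks = malloc(thread_count * sizeof *tasks);
    if (threads == NULL || tasks == NULL) {
        free(threads);
        free(tasks);
        return -1;
    }

    int rc = 0;
    size_t started = 0;
    for (size_t t = 0; t < thread_count; t++) {
        tasks[t].a = a;
        tasks[t].b = b;
        /* 64-bit arithmetic keeps the slice bounds from overflowing. */
        tasks[t].start = (size_t) ((uint64_t) t * n / thread_count);
        tasks[t].end = (size_t) ((uint64_t) (t + 1) * n / thread_count);
        tasks[t].result = 0.0;
        /* NULL attributes: the default detach state is joinable. */
        if (pthread_create(&threads[t], NULL, dot_worker, &tasks[t]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }

    /* pthread_join orders each worker's write before the read below. */
    double total = 0.0;
    for (size_t t = 0; t < started; t++) {
        if (pthread_join(threads[t], NULL) != 0) {
            rc = -1;
        } else {
            total += tasks[t].result;
        }
    }

    free(threads);
    free(tasks);
    if (rc == 0) {
        *out = total;
    }
    return rc;
}
```

Passing a pointer to a per-thread struct also avoids casting an int to void * and back, as the original Calculate() does. A build command along the lines of gcc test.c -o DP -pthread -g should be enough for this version, since it uses nothing from libm.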
User contributions licensed under: CC BY-SA