
Multithreaded word count in C

I know I said I would try to figure this out on my own, and I really did try. I also looked elsewhere before posting here again, but I just ended up with this mess:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>

void partition_file(int n);
void *count_words(void *pos);

int total_count = 0;
int *seg_size = 0;

int main()
{
    int file=0;
    pthread_t tid;
    if((file=open("Device-Driver.txt",O_RDONLY)) < -1)
                return 1;
    partition_file(8);

    for (int i=0; i < 8; i++)
    {
        pthread_create(&tid, NULL, count_words, (void *) seg_size);
    }
    pthread_exit(NULL);
    return 0;

}

void partition_file(int n)
{
    int file=0;
    file=open("Device-Driver.txt",O_RDONLY);
    int size = lseek(file, 0, SEEK_END);
    seg_size = size / n);
    close(file);
}

void *count_words(void *pos)
{
    int file=0;
    int p = *((int *) pos);
    char buffer[seg_size];
    file=open("Device-Driver.txt",O_RDONLY);
    lseek(file,p,SEEK_SET); 
    read(file,buffer,seg_size);     
    for(int i = 0; i < size; i++)
    {
        if(buffer[i] == " ") total_count +=1;
    }
}

How best should I fix this? Specifically these errors and warnings:

warning: assignment to ‘int *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
   36 |  seg_size = size / n);
      |           ^
error: expected ‘;’ before ‘)’ token
   36 |  seg_size = size / n);
      |                     ^
      |                     ;
error: expected statement before ‘)’ token
In function ‘count_words’:
error: size of array ‘buffer’ has non-integer type
   44 |  char buffer[seg_size];
      |       ^~~~~~
warning: passing argument 3 of ‘read’ makes integer from pointer without a cast [-Wint-conversion]
   47 |  read(file,buffer,seg_size);
      |                   ^~~~~~~~
      |                   |
      |                   int *
In file included
/usr/include/unistd.h:360:52: note: expected ‘size_t’ {aka ‘long unsigned int’} but argument is of type ‘int *’
  360 | extern ssize_t read (int __fd, void *__buf, size_t __nbytes) __wur;
      |                                             ~~~~~~~^~~~~~~~
error: ‘size’ undeclared (first use in this function)
   48 |  for(int i = 0; i < size; i++)
      |                     ^~~~
warning: comparison between pointer and integer
   50 |   if(buffer[i] == " ") total_count +=1;
      |                ^~

I think the part that counts words in the buffer is okay, but I’m not sure how the function used by the threads should be set up or how arguments should be passed to the threads. It wants very specific variable types, but I need to use other types, and like I said, it kind of just turned into a mess.


Answer

lseek only moves the file's read position (the file offset); it does not read any data.
Once the read position is where you want it, you read the characters with read.
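
As a minimal sketch of that two-step pattern, the helper below positions the offset with lseek and then reads from it. The name read_at and its parameters are made up for illustration; they are not part of the question's code.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes starting at byte `offset`.
   Returns the number of bytes read, or -1 on error. */
ssize_t read_at(const char *path, off_t offset, char *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)                 /* open reports failure with exactly -1 */
        return -1;

    /* lseek only repositions this descriptor's offset; no data is read. */
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }

    /* read transfers bytes starting at that offset and advances it. */
    ssize_t n = read(fd, buf, len);
    close(fd);
    return n;
}
```

Note that read may return fewer bytes than requested (for example near the end of the file), so the return value, not len, says how much of the buffer is valid.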


I think that in your exercise each thread must have its own read position. That means each thread should open the file itself, because threads that share one file descriptor also share its offset.

If the file is 800 bytes and is divided into 8 segments:

thread 1 reads bytes 0 to 99:
lseek(file, 0, SEEK_SET), read(file, buffer, 100), count words in buffer…

thread 2: lseek(file, 100, SEEK_SET), read(file, buffer, 100), count words in buffer…

thread 3: lseek(file, 200, SEEK_SET), read(file, buffer, 100), count words in buffer…

etc…

You do not have to create temporary files.
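
The scheme above could be sketched roughly as follows. This is one possible layout, not the only one: the struct segment type and the names count_segment and count_spaces_parallel are invented for this example. Like the loop in the question, it counts ' ' characters rather than true words. Each thread gets a pointer to its own struct, which is how differently typed arguments get through the void * parameter, and each thread writes only its own counter, so no lock is needed.

```c
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-thread argument block; nothing here is shared. */
struct segment {
    const char *path;  /* file to read */
    off_t start;       /* first byte of this thread's segment */
    size_t len;        /* number of bytes in the segment */
    long spaces;       /* result: ' ' characters found */
};

static void *count_segment(void *arg)
{
    struct segment *seg = arg;
    char buf[4096];
    size_t left = seg->len;

    /* A private descriptor gives this thread a private file offset.
       If open fails, the segment simply contributes 0 in this sketch. */
    int fd = open(seg->path, O_RDONLY);
    if (fd == -1)
        return NULL;

    if (lseek(fd, seg->start, SEEK_SET) != (off_t)-1) {
        while (left > 0) {
            size_t want = left < sizeof buf ? left : sizeof buf;
            ssize_t n = read(fd, buf, want);
            if (n <= 0)                    /* error or end of file */
                break;
            for (ssize_t i = 0; i < n; i++)
                if (buf[i] == ' ')         /* char literal, not the string " " */
                    seg->spaces++;
            left -= (size_t)n;
        }
    }
    close(fd);
    return NULL;
}

/* Returns the total number of spaces in the file, or -1 on error. */
long count_spaces_parallel(const char *path, int n_threads)
{
    if (n_threads <= 0)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    off_t size = lseek(fd, 0, SEEK_END);   /* offset at the end == file size */
    close(fd);
    if (size == (off_t)-1)
        return -1;

    pthread_t *tids = malloc((size_t)n_threads * sizeof *tids);
    struct segment *segs = calloc((size_t)n_threads, sizeof *segs);
    if (tids == NULL || segs == NULL) {
        free(tids);
        free(segs);
        return -1;
    }

    off_t seg_size = size / n_threads;
    int created = 0;
    for (int i = 0; i < n_threads; i++) {
        segs[i].path = path;
        segs[i].start = (off_t)i * seg_size;
        /* The last thread also takes the bytes left over by the division. */
        segs[i].len = (size_t)(i == n_threads - 1 ? size - segs[i].start
                                                  : seg_size);
        if (pthread_create(&tids[i], NULL, count_segment, &segs[i]) != 0)
            break;
        created++;
    }

    /* Join before summing, so every per-thread counter is final. */
    long total = 0;
    for (int i = 0; i < created; i++) {
        pthread_join(tids[i], NULL);
        total += segs[i].spaces;
    }

    free(tids);
    free(segs);
    return created == n_threads ? total : -1;
}
```

With the question's file this would be called as count_spaces_parallel("Device-Driver.txt", 8) and built with something like gcc -pthread. Counting real words instead of spaces would also need to treat newlines and tabs as separators and to handle a word that straddles two segments.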

User contributions licensed under: CC BY-SA