For this assignment, your program is to read War and Peace (a text copy is included with this assignment) and count and tally each word that is 6 or more characters long. Again, you will use only the Linux file system calls (not the C library file functions), i.e. open, close, read, lseek, pread.
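The word-length rule can be sketched as a scan over a buffer that has already been read in. This is a minimal sketch, not the required design: it assumes a "word" is a run of alphabetic characters (so apostrophes and hyphens end a word), and the names scan_words, word_fn and MIN_WORD_LEN are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

#define MIN_WORD_LEN 6   /* assignment rule: 6 or more characters */

/* Hypothetical callback: receives each qualifying word.
 * The word is NOT NUL-terminated; use len. */
typedef void (*word_fn)(const char *word, size_t len, void *ctx);

/* Scan buf[0..len) and report every run of letters that is at least
 * MIN_WORD_LEN characters long. Returns how many such words were found.
 * Assumption: a word is a maximal run of alphabetic characters. */
size_t scan_words(const char *buf, size_t len, word_fn emit, void *ctx)
{
    size_t found = 0;
    size_t i = 0;

    while (i < len) {
        /* Skip anything that is not a letter. */
        while (i < len && !isalpha((unsigned char)buf[i]))
            i++;

        /* Consume one run of letters. */
        size_t start = i;
        while (i < len && isalpha((unsigned char)buf[i]))
            i++;

        if (i - start >= MIN_WORD_LEN) {
            if (emit != NULL)
                emit(buf + start, i - start, ctx);
            found++;
        }
    }
    return found;
}
```

Passing NULL for emit just counts the qualifying words; a real tally would supply a callback that records each word.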

BUT, it is to do this using threads. Each thread will take a chunk of the file and process it, returning its results to main, which tallies them (tallying directly into shared memory is also okay). Main will then print the ten words of 6 or more characters with the highest tallies, ordered from highest to lowest, along with their counts, i.e. the top ten words and the number of times each appears in the text. Remember that this assignment will be using the pthread functions.
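Once the tallies are merged, picking the top ten is a sort by count. The sketch below is one possible approach, assuming the tally ends up in a flat array; the WordCount record and print_top are hypothetical names, and the output format is not prescribed by the assignment.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical tally record; a real program may store words differently. */
typedef struct {
    const char *word;
    long count;
} WordCount;

/* qsort comparator: larger counts first. */
static int by_count_desc(const void *a, const void *b)
{
    const WordCount *x = a;
    const WordCount *y = b;

    if (x->count != y->count)
        return (x->count < y->count) ? 1 : -1;
    return 0;
}

/* Sort the tally from highest to lowest count and print at most `limit`
 * entries. Returns the number of entries printed. */
size_t print_top(WordCount *tally, size_t n, size_t limit)
{
    qsort(tally, n, sizeof *tally, by_count_desc);

    size_t shown = (n < limit) ? n : limit;
    for (size_t i = 0; i < shown; i++)
        printf("%2zu. %-20s %ld\n", i + 1, tally[i].word, tally[i].count);
    return shown;
}
```

Calling print_top(tally, n, 10) after all threads have been joined would print the ten highest entries.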

Your program should take two parameters on the command line: FileName and ThreadCount

  • FileName is the name of the file to read - WarAndPeace.txt
  • ThreadCount is the number of threads you should spawn to evenly divide the work.

That is to say, if the parameter is 1, the entire file would be read and processed by one thread. If the parameter is 5, you would divide the file into 5 equal parts (with the last part absorbing any rounding remainder). So thread 1 would take the first fifth of the file, thread 2 the second fifth, and so on. These threads should all be launched together in a loop so that they can execute in parallel.
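The division and the launch loop might look like the following sketch. It assumes the file size is already known; Chunk, chunk_for, worker and run_threads are hypothetical names, and the worker is only a placeholder where the pread and word counting would go. A word that straddles a chunk boundary needs extra handling that this sketch does not attempt.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>

typedef struct {
    off_t start;   /* first byte of this thread's chunk */
    off_t end;     /* one past the last byte of the chunk */
    off_t seen;    /* bytes the worker reports having handled */
} Chunk;

/* Equal parts of file_size / thread_count bytes; the last thread also
 * takes the remainder left over by integer division. */
Chunk chunk_for(off_t file_size, int thread_count, int index)
{
    assert(thread_count > 0 && index >= 0 && index < thread_count);
    off_t base = file_size / thread_count;
    Chunk c;
    c.start = base * index;
    c.end = (index == thread_count - 1) ? file_size : c.start + base;
    c.seen = 0;
    return c;
}

/* Placeholder worker: a real one would pread() its range and count words. */
static void *worker(void *arg)
{
    Chunk *c = arg;
    c->seen = c->end - c->start;
    return NULL;
}

/* Launch every thread in one loop, then join them in a second loop.
 * Returns the total bytes covered by all chunks, or -1 on failure. */
off_t run_threads(off_t file_size, int thread_count)
{
    if (thread_count <= 0)
        return -1;

    pthread_t *tids = malloc(sizeof *tids * (size_t)thread_count);
    Chunk *chunks = malloc(sizeof *chunks * (size_t)thread_count);
    if (tids == NULL || chunks == NULL) {
        free(tids);
        free(chunks);
        return -1;
    }

    int started = 0;
    for (int i = 0; i < thread_count; i++) {
        chunks[i] = chunk_for(file_size, thread_count, i);
        if (pthread_create(&tids[i], NULL, worker, &chunks[i]) != 0)
            break;
        started++;
    }

    off_t total = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        total += chunks[i].seen;
    }

    free(tids);
    free(chunks);
    return (started == thread_count) ? total : -1;
}
```

Joining inside the same loop that creates the threads would make them run one after another, which is why the create and join loops are kept separate here.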

#include <time.h> in your code, and include the code below in your main. This will display how much time your program takes. Your submission should include a run with 1 thread, 2 threads, 4 threads, and 8 threads. How do the times compare?

struct timespec startTime;
struct timespec endTime;
clock_gettime(CLOCK_REALTIME, &startTime);

<YOUR CODE IN MAIN HERE>

clock_gettime(CLOCK_REALTIME, &endTime);
time_t sec = endTime.tv_sec - startTime.tv_sec;
long n_sec = endTime.tv_nsec - startTime.tv_nsec;
if (endTime.tv_nsec < startTime.tv_nsec)
{
    --sec;
    n_sec = n_sec + 1000000000L;
}

printf("Total Time was %ld.%09ld seconds\n", sec, n_sec);

A template main containing this code is provided. (Don't forget to rename it to follow the naming conventions.)

Hints: Do not forget to protect critical sections. Make sure you use thread-safe library calls. You will need to know how long the input file is; look up lseek.
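Two of these hints can be sketched briefly. The first function finds the file length by seeking to the end with lseek; the second guards a shared counter with a pthread mutex. The names file_length, add_to_total, read_total, tally_lock and shared_total are hypothetical, and a single shared long stands in for whatever shared tally structure the program actually uses.

```c
#include <fcntl.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the length of the file in bytes, or -1 on error.
 * lseek to offset 0 from SEEK_END returns the end offset, i.e. the size. */
off_t file_length(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    off_t size = lseek(fd, 0, SEEK_END);
    close(fd);
    return size;
}

/* Shared state touched by several threads (stand-in for a real tally). */
static pthread_mutex_t tally_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_total = 0;

/* Critical section: only one thread at a time may update shared_total. */
void add_to_total(long n)
{
    pthread_mutex_lock(&tally_lock);
    shared_total += n;
    pthread_mutex_unlock(&tally_lock);
}

/* Read the counter under the same lock so the value is consistent. */
long read_total(void)
{
    pthread_mutex_lock(&tally_lock);
    long value = shared_total;
    pthread_mutex_unlock(&tally_lock);
    return value;
}
```

Keeping the locked region as small as possible matters for the timing comparison, since threads waiting on the lock are not doing useful work.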

Do a short writeup in PDF format that includes a description of what you did, the issues you encountered and how you resolved them, and an analysis of the results. Explain and reflect on why the times for the different runs are what they are and how each run compares with the others. Also include the compilation and execution output from your program in the writeup. Your execution output should include at least 4 runs: 1 thread, 2 threads, 4 threads, and 8 threads.
