Part One

Introduction:

Sorting routines are among the most widely used and studied algorithms. Every student should know how to implement several different kinds of sorts, and should have idea about how they perform, both theoretically and practically. This programming project is designed to give the student practice in implementing and observing the behavior of four sorts: Insertion Sort, Merge Sort, Heap Sort, and Quick Sort.

Resources

The algorithms for Insertion Sort and Merge sort are given; the algorithm for Heap Sort is given; and the algorithm for Quick Sort is given.

Programs must be written in standard C, C++

You will find 12 data files for this part of the project. These files all contain shuffled lists of integers, with one integer list per line. The files are:

Filename # items lowest highest Description
Num8.txt 23 1 8 no omissions, no duplicates
Num16.txt 24 1 16 no omissions, no duplicates
Num32.txt 25 1 32 no omissions, no duplicates
Num64.txt 26 1 64 no omissions, no duplicates
Num128.txt 27 1 128 omissions/duplicates possible
Num256.txt 28 1 256 omissions/duplicates possible
Num512.txt 29 1 512 omissions/duplicates possible
Num1024.txt 210 1 1024 omissions/duplicates possible
Num2048.txt 211 1 2048 omissions/duplicates possible
Num4096.txt 212 1 4096 omissions/duplicates possible
Num8192.txt 213 1 8192 omissions/duplicates possible
Num16284.txt 214 1 16284 omissions/duplicates possible

Description:

You will write C, C++ code that implements the algorithms for sorting routines mentioned above. As part of your code, you will include counters that iterate whenever a specific line of the algorithm is executed.

Some lines in an algorithm may have a higher cost than other lines. For example, the function call in line 5 in the Merge Sort algorithm is executed only 7 times for an array of 8 elements, but the body of the Merge function which is being called has many lines, some of which are executed more than once. So the cost of line 5 in the Merge sort algorithm is higher than the other 4 lines, We can use the cost of the highest-cost line as an indicator of the cost of the algorithm as a whole.

Insertion Sort:

Here is the pseudocode for Insertion Sort, modified to include a counter:

count <- 0
Insertion_Sort(A)
1 for j <- 2 to length(A) do
2 key <- A[j]
3 // Insert A[j] into the sorted sequence A[1..j - 1]
4 i <- j - 1
5 while i > 0 and A[i] > key do
5.5 count <- count + 1
6 A[i + 1] <- A[i]
7 i <- i - 1
8 A[i + 1] <- key

Your code for Insertion Sort should have a line in it that is equivalent to line 5.5 in the Insertion_Sort pseudocode above. The global variable count will keep a running total of the number of times this line is executed. When you exit from the call to the Insertion Sort function, you should print out the values of n (the length of the array) and count as an indicator of the cost of the function.

Merge Sort:

Here is the pseudocode for Merge Sort, modified to include a counter:

count <- 0
Merge_Sort(A, p, r)
1 if p < r
2 then q <- (p + r) / 2
3 Merge-Sort (A, p, q)
4 Merge-Sort (A, q+1, r)
5 Merge (A, p, q, r)

And here is the modified algorithm for the Merge function used by Merge Sort:

Merge (A, p, q, r)
1 n1 <- (q - p) + 1
2 n2 <- (r - q)
3 create arrays L[1..n1+1] and R[1..n2+2]
4 for i <- 1 to n1 do
5 L[i] <- A[(p + i) -1]
6 for j <- 1 to n2 do
7 R[j] <- A[q + j]
8 L[n1 + 1] <- infinity
9 R[n2 + 1] <- infinity
10 i <- 1
11 j <- 1
12 for k <- p to r do
12.5 count <- count + 1
13 if L[I] <= R[j]
14 then A[k] <- L[i]
15 i <- i + 1
16 else A[k] <- R[j]
17 j <- j + 1

Your code for Merge Sort should have a line of code in it that is equivalent to line 12.5 in the Merge pseudocode above. The global variable count will keep a running total of the number of times this line is executed. When you exit from the call to the Insertion Sort function, you should print out the values of n (the length of the array) and count as an indicator of the cost of the function.

Heap Sort:

Here is the pseudocode for the Heap Sort, modified to include a counter:

count <- 0
HeapSort(A)
1 Build_Max_Heap(A)
2 for i <- length[A] downto 2 do
3 exchange A[1] <-> A[i]
4 heap-size[A] <- heap-size[A] - 1
5 Max_Heapify(A, 1)

And here is the algorithm for Max_Heapify function used by Heap Sort:

Max_Heapify(A, i)
1 l <- LEFT(i)
2 r <- RIGHT(i)
3 if l <= heap-size[A] and A[l] > A[i]
4 then largest <- l
5 else largest <- i
6 if r <= heap-size[A] and A[r] > A[largest]
7 then largest <- r
8 if largest != i
9 then exchange A[i] <-> A[largest]
9.5 count <- count + 1
10 Max_heapify(A, largest)

And here is the algorithm for the Build_Max_Heap function used by Heap Sort:

Build_Max_Heap(A)
1 heap-size[A] <- length[A]
2 for i <- floor(length[A]/2) downto 1 do
3 Max_Heapify(A, i)

Your program for Heap Sort should have a line of code in it that is equivalent to line 10 in the Max_Heapify pseudocode above. In your program, a global counter should keep track of the number of times this line is executed.

Quick Sort:

Here is the pseudocode for Quick Sort, modified to include a counter

QuickSort(A)
1 Count <- 0
2 QUICKSORT(A, 1, length[A])

Here is the pseudocode for the QUICKSORT function used by Quick Sort:

QUICKSORT(A, p, r)
1 if p < r
2 then q <- PARTITION(A, p, r)
3 QUICKSORT(A, p, q-1)
4 QUICKSORT(A, q+1, r)

Here is the pseudocode for the PARTITION function used by QUICKSORT():

PARTITION(A, p, r)
1 x <- A[r]
2 i <- p - 1
3 for j <- p to r-1
3.5 count <- count + 1
4 do if A[j] <= x
5 then i <- i + 1
6 exchange A[i] <-> A[j]
7 exchange A[i+1] <-> A[r]
8 return i+1

Program Execution:

Your program should read in data from Num8.txt, run Insertion Sort on it, and store its results; read in data from Num16.txt, run Insertion Sort on it, and store its results, etc., up through file Num16284.

Next it should repeat the process, using Merge Sort, Heap Sort, and Quick Sort as the sorting routines.

When your program terminates, you should have 48 sets of results. Each set of results should contain:

  • the value of count immediately prior the termination of the algorithm,
  • the array after having been sorted by your sort routine

Analysis

Analyze and discuss the results of your experiment. You may want to plot your results using Excel or a graphing program to help you visualize the results better. At the very least answer the following questions:

  • Discuss what your results mean regarding the theoretical run-time of the different algorithms.
  • Do the sorts really take O(n2) and O(n lg n) steps to run?
  • Explain how you got your answer to this question.
  • Which of the sorts takes the most steps?
  • Which of the O(n lg n) sorts takes the most steps?
  • Why?
  • Under what circumstances might you prefer to use one of the sorts versus others?
  • In general, which sort seems preferable?
  • Why?
Academic Honesty!
It is not our intention to break the school's academic policy. Posted solutions are meant to be used as a reference and should not be submitted as is. We are not held liable for any misuse of the solutions. Please see the frequently asked questions page for further questions and inquiries.
Kindly complete the form. Please provide a valid email address and we will get back to you within 24 hours. Payment is through PayPal, Buy me a Coffee or Cryptocurrency. We are a nonprofit organization however we need funds to keep this organization operating and to be able to complete our research and development projects.