it2051229 CSci 2002 Sorting

Notes

Why so much attention to sorting? Sorting is the process of ordering a set of items. The two most common orderings are ascending order and descending order. In the context of data structures, it is the process of arranging nodes in an order based on the contents of one of the fields in the nodes.

For example, the nodes that comprise a collection of telephone listings are usually sorted in ascending order based on the contents of the name field, while the results of a weight-lifting contest would possibly be sorted in descending order based on the contents of the maximum weight lifted field.

Most often, nodes are sorted for one of two reasons:

To produce sorted output listings.
To improve the speed of subsequent use of a data structure.

When nodes are output in sorted order, not only are they more pleasant to read, but it is also much easier to find a particular listing from among the printed listings. Drawing conclusions based on the data presented in sorted order often becomes self-evident. Consider again the output listing of a weight-lifting contest that is sorted based on the maximum weight lifted field. It would be obvious who won the competition because the analysis necessary to determine the winner is the sorting process itself.

Often the speed of the fetch method can be improved if the data set is stored in a sorted order. Consider the binary search algorithm to locate a node which is an efficient, O(log2n), search algorithm. Thus, even if the data set is never output in sorted order, sorting algorithms are important because they can speed up one or more of the basic operations performed on a data set.

The additional time needed to sort can be held to acceptable levels through the use of efficient sorting algorithms and good sorting strategies. As an example of a sorting strategy, consider the following scenario.

When a data set is to be output in sorted order, two alternatives are available: sort the nodes just prior to outputting them, or store the nodes in the data set in a sorted order. The former approach is attractive if the nodes are rarely output in sorted order, or if the field on which the nodes are sorted changes often. If, however, the nodes are often output in sorted order, then it may be more efficient to store the nodes in sorted order, especially if the contents of the field on which they are sorted rarely changes.

You are studying several of the classic sorting algorithms and examine the advantages of each. Some of the algorithms require more memory than others, while some execute faster than others. It can be shown, however, that there is a theoretical minimum number of comparisons required to sort a set of n items. That's a fair standard for goodness when analyzing the performance of the algorithms presented in Chapter 8. Some data sets exhibit special characteristics that allow some sorting algorithms to perform faster than the theoretical minimum.

Aside from speed and memory differences, some of the algorithms are much easier to code than others. Thus, the selection of the best sorting algorithm for a particular application is usually based on the speed and memory constraints of the application, the nature of the data set being sorted, and the coding skills of the programmer.

Memory Overhead

When you work in the industry, you will sometimes deal with huge data sets. Examples from banking are easy to comprehend and are often used. Let's, however, consider the engineering world. Applications that work day and night, like traffic lights at the intersection, deal with just a few images from the cameras to calculate who will be giving green light at the next switch. On the other hand, a large automated weather-forecasting system (similar other real-time artificial intelligence systems) needs to quickly sort millions of values produced by radars and satellite cameras.

The Bubble Sort

When this algorithm is presented with an array of primitive values or an array of object references, the primitive values or the references are swapped within the array. Thus, the only extra storage required to perform the algorithm is one extra memory cell (temp), used in the classic swapping algorithm, temp = items[t]; items[t] = items[b]; items[b] = temp;

The memory overhead of this algorithm is the lowest overhead of the sorting algorithms.

The Merge Sort

The memory overhead associated with this sorting algorithm is essentially the storage associated with the n element temporary array. Each of its elements stores either a primitive value (the items being sorted) or a reference to the objects being sorted. It means the overhead with be n times higher than that found for Bubble Sort.

Quicksort

Storage is required to keep track of the ends of the partitions. Considering the number of passes needed, as explained in your textbook, the storage amount would be in the order of O(log₂n).

Programming Project

Create an application that does the following:

Prompts the user to enter integers until the user indicates that they are done.
Stores the integers in a linked list.
Sorts the linked list.
Prints out the sorted list of integers.

Optional Task

Create an application that does the following:

Prompts the user to enter two sets of integers.
Determines if the two sets are equal and prints out the result. Note that the sets are not ordered, that is sets {1, 3, 5} and {5, 1, 3} are equal.
- Feel free to use sorting where appropriate.

Academic Honesty!

It is not our intention to break the school's academic policy. Posted solutions are meant to be used as a reference and should not be submitted as is. We are not held liable for any misuse of the solutions. Please see the frequently asked questions page for further questions and inquiries.

Kindly complete the form. Please provide a valid email address and we will get back to you within 24 hours. Payment is through PayPal, Buy me a Coffee or Cryptocurrency. We are a nonprofit organization however we need funds to keep this organization operating and to be able to complete our research and development projects.