For this lab, you will create a Hash Table Abstract Data Type (ADT) that will be automatically tested and evaluated. Your ADT must adhere to the course coding conventions and must be made as generic as possible through the use of void pointers and function pointers.

As you reach the end of the lab tutorial section, you will be asked to complete the Lab Project 2: Hash Table. This is part of course assessment; further details are provided following the Lab Tutorial section.

Hash Table ADT Concept

During this lab tutorial, we will walk through the design of pseudo-code implementation of the Hash Table ADT. This will include instruction on the design of fundamental Hash Table operations.

A hash table is a data structure that provides quick insert and lookup operations. Conceptually, it is similar to a set of numbered lockers, and a list that indicates the name (or id number) of the person who is using each locker. Imagine further that the name, or id number, of each person had been altered to contain their locker number so that given a name, the locker number is immediately knowable. The name or id number is known as the key, and the contents of the locker would be the value in this contrived example. A hash table ADT is based on this same operational paradigm. Additional reference material for hash tables can be found in the References (Associative Arrays and Hash Tables sections).

A Hash Table is a data structure in which data is stored sequentially in a table. A hash table typically stores key-value pairs of data, where the key is used to determine the location of the data in the table and the value is the actual data of interest. Unlike a linked list or an array, a Hash Table provides a convenient method for quickly storing and accessing data without having to search or sort the entire container for the value. The best case for the hash table storing and accessing data is a constant time of O(1) and in the worst case, it is the same algorithm time as a linked list or array O(n).

Figure 2.1: Hash table. see image.

Description of Figure 2.1

A Hash Table is a data structure that stores data elements based on the combination of the input data and a hash function. The hash table relies on a hash function to manipulate the value of the key to create a direct mapping (or 1-to-1 mapping) to a location in the hash table. A hash function can be any function that converts a key into a positional value or hashed index. Many different hash functions can be written to facilitate the 1-to-1 mapping. The primary considerations for hash functions are a) efficient calculation of index value and b) repeatability. A hash function must always calculate the same index given the same key. For more information about hash functions please consult the Hash Functions (part of Hash Tables section) in the References.

Hash functions may map two keys to the same position in the table. Often there are more data elements than positions in the table, making such a collision inevitable. When two data elements hash to the same position it is referred to as a collision. Hash collisions require a collision resolution strategy. Several different collision handling strategies exist including Open Addressing and Separate Chaining. Please review the materials in the References to ensure that you understand both types of collision resolution. The hash table for this lab will be designed using a collision resolution strategy called separate chaining. Separate chaining stores any hash collisions in a linked list at the same index in the table.

Figure 2.2: Concept of hash table separate chaining. see image.

Node Structure

A Hash Table node is a structure composed of a hashed key, the data element, and a List to handle collision resolution. The list contains all the data elements that collided at a specific index. The data in the node stores information relevant to your programs such as integers, strings, or a structure. The hashed key is produced based on your hash function and the data element.

A hash table is typically based on an array. The length of the array is known as the table size. Since we are creating a hash table that uses separate chaining, our table will store nodes that are similar to the nodes stored by the List ADT. Because we want our hash table to be able to store any kind of data, the node must be a C structure with a void pointer for the actual data. The node also requires a next pointer and an integer for the key value. While a hash tables key value can be any type of variable, for this lab we will restrict the key to an integer value. Hash functions typically work better with integer keys. A common strategy for alphanumeric keys is to precondition them to create an integer keys. More information about preconditioning can be found in the References.

Hash Table Fundamental ADT Operations

Node Construction

The first step in creating a Hash Table is to develop the Node structure used internally by the data structure. As previously mentioned, the Node structure will be composed of the following variables:

void * data;
int key;
Node* next;

During the initialization of a Node structure, these three variables will need to be set. Throughout this lab tutorial, we will reference the Hash Table API download zipped header file which provides a description of the functions in the header file. The header of your Hash Table ADT must be exactly the same as the Hash Table API to ensure your program will work with the testing application. The following function describes the initialization of the Node structure:

// Create a new Node, set data and pointers.
Node* initializeNode( int key, void* data )
create a pointer to a new Node
set next node's next pointer to null
set the node's key value to the parameter value
set the node's data to the parameter data value
return the new Node

To recreate this function, the structures data member must be assigned to the functions parameter value, i.e., set your nodes data member to the value being passed into the function.

Hash Table

The Hash Table ADT is a data structure which creates a table of values, each position in the table contains the previously created Node structure. Since the Hash Table is abstract it does not know in advance which type of data it will contain, thus function pointers can be passed into the Hash Table initialization function. To initialize the table, you must set the size of the table, allocate memory for the table and store function pointers that indicate how to handle deletion, printing, and hashing of data elements:

// Hash Table Structure Elements
int tableSize
Node** table
int (*hashFunction)( int tableSize, int key)
void (*deleteFunction)(void *toBeDelete)

The createTable function initialize the hash table with proper function pointers, the Hash Table structure is referred to as HTable throughout the code.

// Allocates properties of the htable.
HTable * createTable( int tableSize,
int (*hashFunction) (int tableSize, int key),
void (*deleteFunction)(void *toBeDeleted),
void (*printNode)(void* toBePrinted) )
{
Create and allocate space for a new HTable of tableSize

set delete function to the delete function pointer parameter
set hash function to the hash function parameter
set printNode function to the printNode function parameter

return the new HTable
}

Destruction

In order to delete the Hash Table, you must traverse the hash table one node at a time and delete the data in the node and any additional collision nodes in the table. The Node** structure is an array of arrays that must be removed and the delete function pointer called on data element.

int destroyTable ( HTable * htable )
{
for each element in the htable table
set tempNode to the head of the current element's nodeList

while tempNode != null
set tempDeleteNode to tempNode
set tempNode to tempNode Next
free tempDeleteNode data
free tempDeleteNode

free htable node table
free htable pointer
}

Data Insertion

In order to insert data into the hash table, first check that the node at that index is not empty or null. If empty, simply create and set a new node at index in the table. If a node already exists at the table index location, this indicates a collision has occurred and the search must continue to identify a place to insert the data element in the collision list.

void insertData( HTable * htable, int key, void * toBeAdded )
{
if htable is not null
create node for key and data
set index to hashFunction(tableSize, key )

if htable table[index] exists
// Check the collision list first element, else check the rest
if htable table[index] key equals key
replace value with new node
else
search the collision list for the last position
add node to the collision list.
else
set htable table[index] to the new node.
}

Data Removal

Removing data from the Hash Table is very similar to inserting an element. To remove data from the Hash Table, use the hash function to generate the hash key to retrieve the position in the table. If the hash value exists in either the first node or any of the collision nodes in the node list, remove it.

void removeData(HTable* htable, int key )
{
if htable is not null
set index to hashFunction( tableSize, key )

if htable hashTable[index] exists

set node to htable table[index] list head
set prev to null
while node key is not equal to key
set prev to node
set node to node next

if node key is equal to key
if prev is null
// Head of the list is removed
set htable table[index] next to node next
free node
if hashTable[index] is null
free nodeList
else
set prev next to node next
free node
}

Searching

Searching for a value in a hash table requires using the hash function to calculate a key, using that hash key to search a specific location in the table. To find the data element, first, check the nodes key value, then check every node in the collision list.

{
if htable is not null
set index to hashFunction( tableSize key )

if htable table[index] does not exist
return null;

set tmpNode equal to htable htable[index]
while( tmpNode not null )
if tmpNode key equals key
return tmpNode data
tmpNode equal to tmpNode next

return null
}

Additional Material

This lab has focused on the core Hash Table operations. There are additional operations that can be added to the Hash Table ADT such as isEmpty() or a check to see if a data element already exists in the hash table. Please read the Hash Table API carefully to ensure your Hash Table library meets all the requirements of the API.

Overview

For this lab project, you will implement the source code for the Hash Table API. When you have completed this lab project, you should successfully be able to:

  • Describe the design of the Hash Table ADT;
  • Explain common operators of the Hash Table ADT;
  • Analyze the algorithmic complexity of Hash Table operators;
  • Assess the benefits of different hash function designs; and
  • Construct a Hash Table library based on pseudo-code in this tutorial, using the provided API (.h file).

Instructions

Your implemented Hash Table API must match exactly that the Hash Table API download zipped/compressed file provided, as the interface defines communication with test suite. The test suite for this lab project will be automated, multiple tools will evaluate the quality of your code under a set of test cases.

Academic Honesty!
It is not our intention to break the school's academic policy. Posted solutions are meant to be used as a reference and should not be submitted as is. We are not held liable for any misuse of the solutions. Please see the frequently asked questions page for further questions and inquiries.
Kindly complete the form. Please provide a valid email address and we will get back to you within 24 hours. Payment is through PayPal, Buy me a Coffee or Cryptocurrency. We are a nonprofit organization however we need funds to keep this organization operating and to be able to complete our research and development projects.