Hashtable as C++ Data Structure

Table of Contents:

  1. What is a Hash Table?
  2. What is Collision in the Hash Table?
  3. Advantages of using Hash Table
  4. Hash Table in C++

What is a Hash Table?

There are various data structures used in programming languages to store and retrieve valuable information. One of the data structures to store information is called Hash Table. Hash Table is a famous data structure that helps in keeping data in a connective manner. Data in the hash table is in an array format. Hash table uses this storage of data in the array having abstract data type. The developers use this data structure to map the keys to their values. Every key has a specific index which they calculate using a particular hash function. The index of the key generated by the hash function is also known as the hash code. This hash code or index helps in searching through the buckets or slots in the array faster to find the required value associated with it.

What is Collision in the Hash Table?

The purpose of the hash function is to compute a specific hash code or index for a given key. This method assigns the key to a particular table address on which it maps. Sometimes the hash function maps two different keys to the same table address. This phenomenon is known as a collision. The developers resolve this situation by using various methods such as linear probing. They use linear probing is a scheme in which the re-hashing technique to eradicate the collision problem. Whenever a collision occurs, it checks for the next open slot in the table for holding the key.

Resolving Collision in Hash Table

If the hash function generates the same storing index for more than one key, this condition leads to a hash collision. There are various ways to solve hash collisions. Some of these ways are following:

  • Resolution by Chaining
  • Resolution by Open Addressing
    • Linear Probing
    • Quadratic Probing
    • Double Hashing

Resolution By Chaining

If the hash function provides the same index for multiple elements, a solution is to store these elements with the same index but using a doubly-linked list instead. The main idea is to point every hash table cell to a linked list having recorded with the same hash function value. Therefore, the program can store the new value having the same index at the end of the corresponding list. For example, if “i” is the slot having multiple elements, there is a pointer associated with “i” pointing to the head of the list.

Resolution By Open Addressing

Open addressing is another way of resolving hash collisions. This technique does not store elements in the form of another data structure. It resolves collisions by inserting elements in the hash table itself. Following are the different methods of open addressing.

Linear Probing

In case of any collision, the linear probing technique resolves the conflict by looking for the next available slot. When the function generates a hash code for the key value, and that position is not occupied, then the key-value occupies that location. However, if that position is not accessible, the probing sequence helps find the next unoccupied position to insert the key value in the hash table. For example, the hash function may change in the linear probing technique according to the following equations:

hash= hash% TableSize
hash= (hash+1)% TableSize
hash= (hash+2) % TableSize
hash= (hash+3)% TableSize

Quadratic Probing

Quadratic probing is similar to linear probing, except there is a difference in the interval used for probing. This technique utilizes quadratic or non-linear distance in order to occupy slots in case a collision occurs. Here, the interval computation depends on adding an arbitrary polynomial value to the existing hash index. Assuming that a key-value already occupies the calculated index, the following are the equations that represent the technique of finding the next available slot by using quadratic probing:

hash= hash% TableSize
hash= (hash+1^2)% TableSize
hash= (hash+2^2) % TableSize
hash= (hash+3^2)% TableSize

Double Hashing

Double hashing is quite similar to the linear probing technique. However, in this case, two hash functions determine the interval between the successive probes. Therefore, in case of any collision, two hash functions calculate the next unoccupied slot to store the key value. Assuming that HV is the hash value that another hash function calculates, the following are the equations that help in the case of using double hashing:

hash= (hash+1*HV)% TableSize
hash= (hash+2*HV) % TableSize
hash= (hash+3*HV)% TableSize

Advantages of using Hash Table

Like every other data structure used in programming, the hash table holds its importance. The developers utilize it to provide a mapping between keys and values using certain functions for such purposes. Following are some of the main advantages of using a hash table data structure.

  • Using a hash table reduces the time to access the stored value. Through hash functions finding the index of stored values becomes more straightforward. It is a time-efficient method to utilize. Even in the collision scenario, the re-hashing scheme provides a relevant hash function to ensure zero collisions. Overall the time complexity of hash tables is constant O(1).
  • The hash tables are chosen over other data structures due to their speed. Insertion and searching operations are faster in hash tables. The efficiency of hash tables increases if it has predicted the maximum number of entries in advance.
  • A useful hash function provides the benefit of working at a low cost. The low-cost execution of a hash function makes it preferable over traditional methods.
  • Using hash tables helps provide a more reliable and flexible structure for the storage and retrieval of data. Searching in hash tables is way faster than arrays and lists.
  • Hash tables provide the opportunity of selecting an efficient time-space tradeoff. It can help to resolve complex problems in certain situations.

Hash Table in C++

Implementation of hash table data structure in C++ is not that complex. Its algorithm consists of forming a complete system for hashing. The first step is to choose an appropriate hash function for the given set of keys and values. The next thing is to utilize the formed hash function to perform the insert, search or delete function supported by the data structure. Moreover, another most essential thing in hashing is to resolve collision problems. Following is an example of implementing hash table data structure in C++.

Creating Hash Table in C++

Hash table stores key-value pairs in the array. A hash function is implemented to find an exact index of the key to be stored in the array. A simple example of using the hash table in c++ is following:

#include<iostream>
#include<cstdlib>
#include<string>
#include<cstdio>
#include<list>

Using namespace std:
Const int size=100;
Class TableEntry
{
            public:
            int h_key;
            int h_value;
 
            TableEntry( int h_key, int h_value)
            {
                        this -> h_key=h_key;
                        this -> h_value=h_value;
            }

};

 

The above code creates a hash table data-entry structure to represent a key-value pair. The variable size initializes the table to a specific given size. Similarly, users can create a hash map to map the keys and values. Following is the piece of code to create a hash map:

 

Class HashTableMap
{
   private:
       TableEntry ** T1;
   public:
       HashTableMap( )
       {
        T1 = new TableEntry * [size];
        for (int i=0; i<size; i++)
        {
         T1[i] = NULL;
        }
       }
};

 

This piece of code creates an initially empty hash table for the given size to store values. The users can use this HashTable Map structure to define various other operations such as insertion, deletion, and searching to perform on the values. Following is an example of utilizing this structure:

Hash Function:

int  getHashCode( int h_key)
  {
   return h_key % size;
  }

The above getHashCode() function takes a given key as its parameter and returns its specific hash value, determining its index in the storage array. Furthermore, this hash value also specifies at what index to search this particular key.

Insertion in Hash Table

void HashInsertion( int h_key, int h_value)
{
   int hc=getHashCode(h_key);
   while ( T1[hc] != NULL && T1[hc] -> h_key!= h_key )
            {
            hc= getHashCode(hc+1);
            }
              If (T1[hc] != NULL)
      {
            delete T1[hc];
              }
            T1[hc] = new TableEntry( h_key, h_value );

}

For performing insertion operation on the hash table, the developers use the getHashCode() function first to calculate the required index of the provided key for insertion in the array. The developers use linear probing to find an appropriate insertion position to avoid the collision in the mapping. The function looks for a vacant position or a deleted cell in the array till it has found it. Therefore, the function uses the constant re-hashing technique to find an empty location for the key. When it has found the appropriate cell, it uses the TableEntry() function to save the value in the T1 array.

Search in Hash Table

int SearchHashTable( int h_key){
   int hc=getHashCode(h_key);
            while ( T1[hc] != NULL && T1[hc] -> h_key!= h_key )
            {
            hc= getHashCode(hc+1);
            }
            if (T1[hc] == NULL)
      {
    cout<<” The value is not found”<<endl;
            return -1;
            }
            else
            return T1[h
c]->h_value;
}

Here, the SearchHashTable() function takes the key as its arguments and searches the value associated with that particular key. It first calculates the key’s hash address and uses it to find the desired value’s location in the array. If it cannot find the value at the calculated hash address, it uses the linear probing method to go ahead and find the next slots to locate the value. Finally, if it finds the value, it returns it the value, else it returns “-1” to indicate that the particular value does not exist.

Deletion from Hash Table

void DeleteValue( int h_key )
{
   int hc=getHashCode(h_key);
    while ( T1[hc] != NULL)
            {
       if( T1[hc]->h_key == h_key)
              {
            break;
            }
            hc= getHashCode(hc+1);
            }
            if( T1[hc] == NULL )
            {
            cout<<” The value is not found”<<endl;
            }
            else
            {
            delete T1[hc];
            }
            cout<<”The value is deleted”<<endl;
}

The above coding example demonstrates the deletion of a particular element from the hash table. The DeleteValue() function accepts the associated key as its parameter. Then, it computes the hash code of the key in the same way as above. In each iteration, it uses linear probing to find the location. The loop breaks if any iteration points to the required item. If the element is not present, it displays it on the console. Finally, if it has found the element in the array, it is removed as per requirement. The users can introduce the dummy item to avoid performance failure and make the removal operation more efficient.