Data Structures T5 Hash Tables

Searching algorithms

If in random order
- Must use a sequential (linear) search
- If there are 10 million items, searching could take a long time
Else
- Any other search algorithm, such as binary search.
- Hash Table
  - Find any record in the dataset almost instantly
  - Items to be deleted must be replaced with a dummy marker, e.g; "N/A"
    - If a new item has to be added, and the address contains "N/A", it will replace the dummy item.
  - Address of data is calculated using the data itself via hashing algorithm
    - A simple hashing algorithm is to assign 1000 slots which have a 4 digit key
      - Address = key mod (no# slots)
    - This turns the following example data into the following addresses
      - 0287 -> 287
      - 0999 -> 999
      - 1074 -> 074
      - 9074 -> 074
      - 3287 -> 287
    - Address 074 has two pieces of data - this is known as a collison.
      - Happens when an algorithm generates the same address for different primary keys.
      - Also known as a "synonym"
      - Collisions are unavoidable.

Dealing with collisions

Simplest method is to put the item in the next free slot in the table.
- This can lead to clustering of items in the table.
Another method…

Get Revising