Data Structures T5 Hash Tables


Searching algorithms

  • If in random order
    • Must use a sequential (linear) search
    • If there are 10 million items, searching could take a long time
  • Else
    • Any other search algorithm, such as binary search.
    • Hash Table
      • Find any record in the dataset almost instantly
      • Items to be deleted must be replaced with a dummy marker, e.g; "N/A"
        • If a new item has to be added, and the address contains "N/A", it will replace the dummy item.
      • Address of data is calculated using the data itself via hashing algorithm
        • A simple hashing algorithm is to assign 1000 slots which have a 4 digit key
          • Address = key mod (no# slots)
        • This turns the following example data into the following addresses
          • 0287 -> 287
          • 0999 -> 999
          • 1074 -> 074
          • 9074 -> 074
          • 3287 -> 287
        • Address 074 has two pieces of data - this is known as a collison.
          • Happens when an algorithm generates the same address for different primary keys.
          • Also known as a "synonym"
          • Collisions are unavoidable.

Dealing with collisions

  • Simplest method is to put the item in the next free slot in the table.
    • This can lead to clustering of items in the table.
  • Another method…


No comments have yet been made