Data Structures

Introduction - Static

A static data stucture has a fixed size which can't be changed while the program is running.

Advanatges:

Easy to Program
Easy to check for overflow
An array allows random access

Disadvantages:

Programmer has to estimate maximum amount of space needed
Can waste a lot of space

1 of 19

Introduction - Dynamic

A dynamic data structure can change in size as the program runs.

Advantages:

Only uses the space that's needed at any time
Makes efficient use of the memory
Storage no longer required can be returned to the system for other uses

Disadvantages:

Difficuly to program
Can be slow to implement searches

2 of 19

Introduction - Arrays

An array is a static data structure.

An array is a collection of one type of data such as an array of integers or an array of strings.

Arrays can have one or many dimensions.

A one dimensional array is called a list.

A two dimensional array is called a table and has columns and rows.

A multidimensional array is called a tuple

3 of 19

Stacks

A stack is a dynamic data structure and has a Last In First Out structure. Items are pushed (added) to the stack and popped (removed) from the stack, always from the top of stack.

A stack uses a pointer, called a stack pointer, which points to the top of the stack. The pointer is incremented when items are added and decremented when items are removed. When the stack is empty, the pointer can be set to -1 to indicate this.

Stacks can be implemented using an array and a pointer which identifies the position in the array of the last item added. Variable are also required to store the size of the stack and the maximum size.

Uses:

Storing return addresses when procedures are called
Checking syntax during complication

4 of 19

Stacks - Push Item Code

Remember to check there's space for item, if not then generate an error message due to an overflow condition.

If stack_size = max_size then

Output “Stack full”

Else

Read new_item

stack_pointer = stack_pointer +1

stack(stack_pointer) = new_item

stack_size = stack_size + 1

End If

5 of 19

Stacks - Pop Item Code

Remember to check the stack isn't empty, if so then generate an error message due to an underflow condition.

If stack_size = 0 then

Output “Stack empty”

Else

Output stack(stack_pointer)

stack_pointer= stack_pointer – 1

stack_size= stack_size– 1

End if

6 of 19

Queues

A queue is a dynamic data structure and is a First In First Out structure. Items are removed from the front and added to rear.

Queues can be implemented using a 1D array. 2 pointers are required: front_pointer and rear_pointer. 2 variables are also required: size and max.

https://www.youtube.com/watch?v=arj_9lzrpf4

Problems of a linear Queue:

As items are added and removed from the queue the items move along the data structure. This can generate overflow errors if a fixed size has been allocated to the queue. To overcome this circular queues are used.

7 of 19

Queues 2.

A circular queue loops back to the beginning when the end of the queue is reached. (if it is not full).

Uses of Queues:

storing print jobs on a network; jobs need to be processed in the order they are submitted.
storing jobs in a multi-processing system; each job needs a time slice and is processed in order first-come-first-served. If it does not complete in the time slice then it re-joins the back of the queue.

8 of 19

Queues - Adding to linear

Remember to check there is space, if not then generate an error message due to an overflow condition.

If queue_size = max_size then

output “Queue full”

Else

rear_pointer= rear_pointer + 1

read new_item

queue(rear_pointer) = new_item

queue_size= queue_size + 1

End If

9 of 19

Queues - Removing from linear

Remember to check it's not empty, if so then generate an error message due to an underflow condition.

If queue_size = 0 then

output “Empty queue”

Else

output queue(front_pointer)

front_pointer = front_pointer + 1

queue_size= queue_size – 1

End If

10 of 19

Queues - Adding to circular

If queue_size = max_size then

Output “Queue full”

Else

If rear_pointer = max_size then

rear_pointer= 1

Else

rear_pointer= rear_pointer + 1

End if

Read new_item

queue(rear_pointer) = new_item

queue_size= queue_size + 1

End If

11 of 19

Queues - Removing from circular

If queue_size = 0 then

Output “Empty queue”

Else

Output queue(front_pointer)

If front_pointer = max_size then

front_pointer= 1

Else

front_pointer= front_pointer + 1

End If

queue_size= queue_size – 1

End If

12 of 19

Linked Lists

A linked list is a dynamic data structure. The items of the list are not necessarily held in contiguous memory locations or in the order in which they were added to the list. Each item in the list is called a node. Each node contains the data plus a link or pointer to the next item in the list. The last item in the list has a null pointer e.g. -1. A linked list also has a variable called head or start which is a pointer to the first item in the list.

Linked lists also store avaliable memory locations or free space. When an item is added to the list the first free space is used and the head pointer to the free space list is updated. When an item is deleted from the list the location is added to the beginning of the free space list.

Linked lists can be implemented using a 2D record or structure.

Uses of linked lists:

Management of low-level memory, the heap space, which keeps track of used and free space.
Applications which require data to be accessed in more than one order e.g. arrivals and departures. Often listed in chronological order but can also be listed in alphabetical etc.

13 of 19

Linked Lists - Adding

Part 1 – Reading the new item

read newItem

temp = nextFree

nextFree = nextFree.Link

list(temp).data = newItem

You need to check if the list is empty:

if start = -1 then

start = temp

list(temp).link = -1

finished = true

end if

First item in list:

if newItem < list(start).data then

list(temp).link = start

start = temp

finished = true

end if

14 of 19

Linked Lists - Removing

if start = -1 then ‘checking if the list is empty

output “list is empty”

else ‘checking it the item is first in the list

read searchItem

if searchItem = list(start).data then

nextFree(start).link = nextFree

nextFree = start

start = list(start).link

found = true

end if

current = start

remember = current

current = list(current).link

While NOT found AND current <> -1 Do

if searchItem= list(current).data Then

found = true

list(remember).link = list(current).link

else

remember = current

current = list(current).link

end If

end While

end If

15 of 19

Linked Lists - Adding General

current = start

while NOT finished Do

remember = current

current = list(current).link

if newItem< list(current).data then

list(temp).link = current

list(remember).link = temp

finished = true

else

if current = -1 Then

list(current).link = temp

list(temp).link = -1

finished = true

end If

end While

16 of 19

Hash Tables

A hash table is a data structure which maps keys to index values of an array.

It uses a hashing function to calculate an index to a location in the array.

Hash functions are not perfect because they can generate the same location for more than one piece of data.

Hash tables usually use buckets to store more than one piece of data in any one location.

The purpose of the hash function is to distribute the entries across an array of buckets.

The algorithm calculates an index based on a key

index = f(key, array_size)

In order to separate the hash function from the array size this can be done in two steps.

hash = hash_function(key)

index = hash MOD array_size

17 of 19

Hash Tables 2

The load factor is the ratio of the number of entries in the table, n, divided by the total number of buckets, k.

load factor = n/k

As the load factor increases the access rate slows.

Ideally the load factor should always be less than 1 (remember each bucket stores more than one entry)

A collision occurs when the hash function generates the same index for more than one key, these are then stored in the same bucket.

To access the items in a bucket a linked list may be used.

With each item linked to the next one in the bucket.

This slows down access as linked lists can only be accessed linearly.

18 of 19

Hash Tables 3

Hash tables are relatively quick for storing and retrieving data compared to other data structures for large amounts of data.

Performance is very dependent on the load factor and collision rate.

Uses of hash tables:

Used for memory mapping:

storing the locations of files in memory
datanase indexes
implementation of caches

19 of 19

Get Revising

Data Structures

Introduction - Static

Introduction - Dynamic

Introduction - Arrays

Stacks

Stacks - Push Item Code

Stacks - Pop Item Code

Queues

Queues 2.

Queues - Adding to linear

Queues - Removing from linear

Queues - Adding to circular

Queues - Removing from circular

Linked Lists

Linked Lists - Adding

Linked Lists - Removing

Linked Lists - Adding General

Hash Tables

Hash Tables 2

Hash Tables 3

Comments

Similar Computing resources: