Data Structures
- Created by: em1lyturner
- Created on: 30-09-18 10:46
Introduction - Static
A static data stucture has a fixed size which can't be changed while the program is running.
Advanatges:
- Easy to Program
- Easy to check for overflow
- An array allows random access
Disadvantages:
- Programmer has to estimate maximum amount of space needed
- Can waste a lot of space
Introduction - Dynamic
A dynamic data structure can change in size as the program runs.
Advantages:
- Only uses the space that's needed at any time
- Makes efficient use of the memory
- Storage no longer required can be returned to the system for other uses
Disadvantages:
- Difficuly to program
- Can be slow to implement searches
Introduction - Arrays
An array is a static data structure.
An array is a collection of one type of data such as an array of integers or an array of strings.
Arrays can have one or many dimensions.
A one dimensional array is called a list.
A two dimensional array is called a table and has columns and rows.
A multidimensional array is called a tuple
Stacks
A stack is a dynamic data structure and has a Last In First Out structure. Items are pushed (added) to the stack and popped (removed) from the stack, always from the top of stack.
A stack uses a pointer, called a stack pointer, which points to the top of the stack. The pointer is incremented when items are added and decremented when items are removed. When the stack is empty, the pointer can be set to -1 to indicate this.
Stacks can be implemented using an array and a pointer which identifies the position in the array of the last item added. Variable are also required to store the size of the stack and the maximum size.
Uses:
- Storing return addresses when procedures are called
- Checking syntax during complication
Stacks - Push Item Code
Remember to check there's space for item, if not then generate an error message due to an overflow condition.
If stack_size = max_size then
Output “Stack full”
Else
Read new_item
stack_pointer = stack_pointer +1
stack(stack_pointer) = new_item
stack_size = stack_size + 1
End If
Stacks - Pop Item Code
Remember to check the stack isn't empty, if so then generate an error message due to an underflow condition.
If stack_size = 0 then
Output “Stack empty”
Else
Output stack(stack_pointer)
stack_pointer= stack_pointer – 1
stack_size= stack_size– 1
End if
Queues
A queue is a dynamic data structure and is a First In First Out structure. Items are removed from the front and added to rear.
Queues can be implemented using a 1D array. 2 pointers are required: front_pointer and rear_pointer. 2 variables are also required: size and max.
https://www.youtube.com/watch?v=arj_9lzrpf4
Problems of a linear Queue:
As items are added and removed from the queue the items move along the data structure. This can generate overflow errors if a fixed size has been allocated to the queue. To overcome this circular queues are used.
Queues 2.
A circular queue loops back to the beginning when the end of the queue is reached. (if it is not full).
Uses of Queues:
- storing print jobs on a network; jobs need to be processed in the order they are submitted.
- storing jobs in a multi-processing system; each job needs a time slice and is processed in order first-come-first-served. If it does not complete in the time slice then it re-joins the back of the queue.
Queues - Adding to linear
Remember to check there is space, if not then generate an error message due to an overflow condition.
If queue_size = max_size then
output “Queue full”
Else
rear_pointer= rear_pointer + 1
read new_item
queue(rear_pointer) = new_item
queue_size= queue_size + 1
End If
Queues - Removing from linear
Remember to check it's not empty, if so then generate an error message due to an underflow condition.
If queue_size = 0 then
output “Empty queue”
Else
output queue(front_pointer)
front_pointer = front_pointer + 1
queue_size= queue_size – 1
End If
Queues - Adding to circular
If queue_size = max_size then
Output “Queue full”
Else
If rear_pointer = max_size then
rear_pointer= 1
Else
rear_pointer= rear_pointer + 1
End if
Read new_item
queue(rear_pointer) = new_item
queue_size= queue_size + 1
End If
Queues - Removing from circular
If queue_size = 0 then
Output “Empty queue”
Else
Output queue(front_pointer)
If front_pointer = max_size then
front_pointer= 1
Else
front_pointer= front_pointer + 1
End If
queue_size= queue_size – 1
End If
Linked Lists
A linked list is a dynamic data structure. The items of the list are not necessarily held in contiguous memory locations or in the order in which they were added to the list. Each item in the list is called a node. Each node contains the data plus a link or pointer to the next item in the list. The last item in the list has a null pointer e.g. -1. A linked list also has a variable called head or start which is a pointer to the first item in the list.
Linked lists also store avaliable memory locations or free space. When an item is added to the list the first free space is used and the head pointer to the free space list is updated. When an item is deleted from the list the location is added to the beginning of the free space list.
Linked lists can be implemented using a 2D record or structure.
Uses of linked lists:
- Management of low-level memory, the heap space, which keeps track of used and free space.
-
Applications which require data to be accessed in more than one order e.g. arrivals and departures. Often listed in chronological order but can also be listed in alphabetical etc.
Linked Lists - Adding
Part 1 – Reading the new item
read newItem
temp = nextFree
nextFree = nextFree.Link
list(temp).data = newItem
You need to check if the list is empty:
if start = -1 then
start = temp
list(temp).link = -1
finished = true
end if
First item in list:
if newItem < list(start).data then
list(temp).link = start
start = temp
finished = true
end if
Linked Lists - Removing
if start = -1 then ‘checking if the list is empty
output “list is empty”
else ‘checking it the item is first in the list
read searchItem
if searchItem = list(start).data then
nextFree(start).link = nextFree
nextFree = start
start = list(start).link
found = true
end if
current = start
remember = current
current = list(current).link
While NOT found AND current <> -1 Do
if searchItem= list(current).data Then
found = true
list(remember).link = list(current).link
else
remember = current
current = list(current).link
end If
end While
end If
Linked Lists - Adding General
current = start
while NOT finished Do
remember = current
current = list(current).link
if newItem< list(current).data then
list(temp).link = current
list(remember).link = temp
finished = true
else
if current = -1 Then
list(current).link = temp
list(temp).link = -1
finished = true
end If
end If
end While
Hash Tables
A hash table is a data structure which maps keys to index values of an array.
It uses a hashing function to calculate an index to a location in the array.
Hash functions are not perfect because they can generate the same location for more than one piece of data.
Hash tables usually use buckets to store more than one piece of data in any one location.
The purpose of the hash function is to distribute the entries across an array of buckets.
The algorithm calculates an index based on a key
index = f(key, array_size)
In order to separate the hash function from the array size this can be done in two steps.
hash = hash_function(key)
index = hash MOD array_size
Hash Tables 2
The load factor is the ratio of the number of entries in the table, n, divided by the total number of buckets, k.
load factor = n/k
As the load factor increases the access rate slows.
Ideally the load factor should always be less than 1 (remember each bucket stores more than one entry)
A collision occurs when the hash function generates the same index for more than one key, these are then stored in the same bucket.
To access the items in a bucket a linked list may be used.
With each item linked to the next one in the bucket.
This slows down access as linked lists can only be accessed linearly.
Hash Tables 3
Hash tables are relatively quick for storing and retrieving data compared to other data structures for large amounts of data.
Performance is very dependent on the load factor and collision rate.
Uses of hash tables:
Used for memory mapping:
- storing the locations of files in memory
- datanase indexes
- implementation of caches
Related discussions on The Student Room
- How to get an A* in geography NEA »
- 10 marker model response for econ? »
- Unit 2 Marketing campaign Level 3 BTEC business »
- Computer science NEA »
- Title: Crafting a Comprehensive Scientific Thesis: Integrating AI Tools and Human »
- Side Hustle - Sell unused bandwidth to companies for cash »
- How should I write the methodology section on my secondary data? »
- silver badge idea »
- MA Big Data at KCL vs MSc Social Research Methods LSE »
- Need help with civil engineering dissertation based solely on secondary data. »
Comments
No comments have yet been made