SCD - 2.3.1 - Algorithms


T1 - Design of Algorithms

Algorithm - Set of rules to be followed in calculations or other problem-solving operations, especially by a computer

Problems solved by algorithms - Packet routing, Timetabling, Internet searches, Sorting, Writing a compiler

Features of a good algorithm - Clear and precisely stated steps that produce the correct output for any set of valid inputs, allows for invalid inputs, terminates at some point, executes efficiently in as few steps as possible, and is designed so that other people can understand and modify it

The number of assignment statements executed is a good measure of the efficiency of an algorithm, i.e. a loop that runs 1000 times to compute a result is less efficient than computing the same result in a single line

A linear function takes the form f(n) = an + b where a and b are constants e.g. f(n) = 3n or f(n) = 6n + 1

f(n) = 1 is a constant function - no matter the size of n, its value stays the same; in a linear function the an term dominates the constant, so the order of magnitude of a linear function is O(n)


T1 - Order of Magnitude/Big-O Notation

For a quadratic in the form f(n) = an² + bn + c:

  • As n becomes large the n² term increases much faster than either of the other terms, so for large n you can ignore the other terms and so the order of magnitude of a quadratic function is O(n²)

For any function, the dominant term is used to compare algorithms

For a logarithmic function in the form f(n) = a·log₂ n:

  • f(n) increases very slowly in relation to n (base 2 is used because these algorithms typically halve the amount of data at each step), and so the order of magnitude of a logarithmic function is O(log n)

Big-O notation is a measure of the time complexity of an algorithm: an approximation of how the time taken to execute the algorithm grows with the number of items n in the data set
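As a rough illustration (not from the notes; the function names are invented for this sketch), the step counts behind the common orders of magnitude can be compared directly:

```python
import math

def steps_linear(n):      # O(n): one step per item
    return n

def steps_quadratic(n):   # O(n^2): a nested loop over n items
    return n * n

def steps_logarithmic(n): # O(log n): the data is halved at each step
    return math.ceil(math.log2(n))

# Growing n from 10 to 1000 shows how differently each order scales
for n in (10, 1000):
    print(n, steps_logarithmic(n), steps_linear(n), steps_quadratic(n))
```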


T1 - Logarithmic Functions

'Divide and Conquer' algorithms are examples of logarithmic functions and so have time complexity O(log n)

Permutations

The permutations of n items are the number of ways those n items can be arranged; the two types are:

  • Repetition allowed e.g. a combination lock

For a 2-digit lock there are 10 possibilities for each digit, so there are 10 × 10 = 100 different ways; the problem is therefore O(10ⁿ)

  • No repetition allowed e.g. Removing 3 differently coloured balls from a bag 1 at a time

For 3 balls in a bag, removing 1 at a time, there are 3 × 2 × 1 = 6 possibilities; the problem is therefore O(n!)
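The two counts above can be sketched in a few lines (the function names are invented for illustration):

```python
from math import factorial

def lock_combinations(digits):
    """Repetition allowed: 10 choices per digit -> 10^n ways, O(10^n)."""
    return 10 ** digits

def arrangements(n):
    """No repetition: n choices, then n - 1, ... -> n! ways, O(n!)."""
    return factorial(n)

print(lock_combinations(2))  # 100 ways for a 2-digit lock
print(arrangements(3))       # 6 ways to remove 3 balls one at a time
```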

Order of efficiency of algorithms from best to worst: log n, n, n², 2ⁿ, n!


T2 - Searching Algorithms

Linear Search - Look at each item in a list until the desired item is found; on average n/2 items are examined, and in the worst case all n, so a linear search has time complexity O(n)

Binary Search - Very efficient way of searching a sorted list: look at the middle item, determine which side the target is on and eliminate the other side, repeating until the item is found. For a list of 2ⁿ items the maximum number of comparisons is n + 1, so it has time complexity O(log n)

Binary Search Trees - Hold items in such a way that the tree can be searched quickly, giving time complexity O(log n)

An unbalanced binary tree would have time complexity O(n), as you may need to examine every item


T3 - Bubble and Insertion Sort

Often the number of items to be sorted is huge, so choosing an efficient algorithm is very important: quicksort can sort 100,000 items around 3000 times faster than bubble sort. Other factors you may need to consider are the time taken to code the algorithm and its memory requirements

Bubble Sort - Useful for a small number of items: to sort n items, n - 1 passes are made, swapping adjacent items where necessary (at the swap point one item needs to be stored temporarily). As there are 2 nested loops, it has time complexity O(n²)


T3 - Insertion Sort

Efficient sort for a small number of items (e.g. sorting a deck of cards)

Sorting is done by taking each item in turn and inserting it into its correct position among the items already sorted before it, so the outer loop makes a single pass through the list

In the insertion sort there are 2 nested loops, with the outer loop performed n - 1 times and the inner loop a maximum of n - 1 times, so it has time complexity O(n²)

Even though they have the same time complexity, insertion sort is generally faster


T4 - Merge Sort

Merge Sort - Much more efficient than a bubble sort: divide the unsorted list into n sublists, each containing 1 element, then repeatedly merge sublists to produce new sorted sublists until only 1 list remains, which is now sorted

Merge sort is another 'divide and conquer' algorithm which halves the size of each list to be merged: the list is first split in half log n times, and each level of merging back towards a single list requires n operations, so the time complexity is O(n log n)
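The split-then-merge process described above can be sketched as follows (an illustrative implementation, not taken from the notes):

```python
def merge_sort(items):
    """Divide and conquer: split into halves (log n levels of splitting),
    then merge each level back together in n operations -> O(n log n)."""
    if len(items) <= 1:
        return items          # a list of 0 or 1 items is already sorted
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge the two sorted sublists into one sorted list
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]
```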

Fastest Sort - Merge/Quick - O(n log n)

Slowest - Bubble - O(n²)

Recursive – Merge + Quick


T4 - Quick Sort

Quick Sort - Another 'divide and conquer' algorithm, generally much faster in practice than the other sorts

The first step is to choose a value called the pivot value (usually the 1st item). The algorithm then finds the split point (the position where the pivot value belongs in the list), dividing the list into sublists, 1 containing items less than the pivot and 1 containing items greater than it

Left and right pointers are used: the left pointer finds an item larger than the pivot and the right pointer finds an item smaller than the pivot, then the two items are swapped. This is repeated until the pointers cross over, which marks the split point. The process is then repeated on each half of the list

For a list of length n, if the split point falls in the middle there will be log n divisions; finding each split point requires checking each of the items, so the best-case time complexity is O(n log n)

If the split point is skewed to one side the division is uneven, so you end up sorting sublists of 0 and n - 1 items, giving a worst-case time complexity of O(n²); the same applies to a list in reverse order

The pivot does not have to be the first value: you can look at the first, middle and last values and choose the middle one of the three; this is repeated for each subsequent pivot
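A sketch of the pointer-based quicksort described above (illustrative Python, not from the notes), using the median-of-three pivot choice:

```python
def quick_sort(a, first=0, last=None):
    """In-place quicksort with left/right pointers and a
    median-of-three pivot; best case O(n log n), worst case O(n^2)."""
    if last is None:
        last = len(a) - 1
    if first >= last:
        return a                      # 0 or 1 items: already sorted
    # Median-of-three: pick the middle value of first, middle, last
    mid = (first + last) // 2
    pivot = sorted([a[first], a[mid], a[last]])[1]
    left, right = first, last
    while left <= right:
        while a[left] < pivot:        # left pointer finds a larger item
            left += 1
        while a[right] > pivot:       # right pointer finds a smaller item
            right -= 1
        if left <= right:
            a[left], a[right] = a[right], a[left]
            left += 1
            right -= 1
    quick_sort(a, first, right)       # repeat on each half
    quick_sort(a, left, last)
    return a
```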


T5 - Graph Traversal Algorithms

Depth-First - Go as far down one route as possible, then backtrack and go down the next path

  • Uses a stack to keep track of the last node visited and a list to hold the names of the nodes already visited; when everything has been popped from the stack, every node has been visited
  • It is a recursive algorithm, as it calls itself
  • The subroutine returns a list of the nodes in the order they were visited (the exact traversal order does not matter)
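A sketch of the kind of subroutine the card describes (illustrative Python; the adjacency-list layout is an assumption, not from the notes):

```python
def depth_first(graph, start, visited=None):
    """Recursive depth-first traversal. `graph` is assumed to be a dict
    mapping each node to a list of neighbours; the call stack plays the
    role of the explicit stack described above. Returns the nodes in
    the order they were visited."""
    if visited is None:
        visited = []
    visited.append(start)
    for neighbour in graph[start]:
        if neighbour not in visited:
            depth_first(graph, neighbour, visited)
    return visited

# Hypothetical example graph
graph = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A'], 'D': ['B']}
```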

Applications - Job-Scheduling, Finding a path between two vertices, navigating a maze

Graphs can be used to represent mazes, with nodes for the entrance, the exit, dead ends and points where more than 1 path is available, and edges for the paths between them


T5 - Breadth-First

Breadth-First - Explore all neighbours of the current vertex, then the neighbours of each of those vertices, and so on, so all nodes adjacent to the start node A are visited before moving further out

  • Process repeated for each node at this level before moving to next level
  • Uses a queue instead of a stack to keep track of nodes needed to be visited

The start node and the nodes adjacent to the current node are placed in the queue to be visited, with their own adjacent nodes joining the back of the queue; once a node has been visited it is moved to the visited list

Applications - Shortest path between 2 points (packet routing), Web crawler, Facebook looking at friends

As both algorithms visit every vertex and explore every edge:

  • For a sparse graph there are roughly n + e operations (n = vertices, e = edges), where e is proportional to n, giving a time complexity of O(n)
  • For a graph with the maximum number of edges, e approaches n², giving n + n² operations and a time complexity of O(n²)
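The queue-based process above can be sketched as follows (illustrative Python, assuming the same adjacency-dict layout as the depth-first card):

```python
from collections import deque

def breadth_first(graph, start):
    """Breadth-first traversal: visit a node, then queue all of its
    unvisited neighbours before moving to the next level out."""
    visited = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.append(neighbour)   # mark as visited on discovery
                queue.append(neighbour)     # neighbours join the back
    return visited
```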

T5 - Tree Traversal

Depth-First Tree Traversal - A tree is a special graph with no cycles, so the traversal can only move down through the children

The 3 methods are Pre-Order, In-Order and Post-Order, with Pre-Order and Depth-First producing the same order

Depth-First Tree Traversal - Go down each branch until you reach the bottom, where there are no more children, then backtrack
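The three orders can be written recursively as a sketch (not from the notes; the (left, value, right) tuple layout is an assumption, with None for a missing child):

```python
def pre_order(node):
    """Visit the node first, then its left and right subtrees."""
    if node is None:
        return []
    left, value, right = node
    return [value] + pre_order(left) + pre_order(right)

def in_order(node):
    """Visit the left subtree, then the node, then the right subtree."""
    if node is None:
        return []
    left, value, right = node
    return in_order(left) + [value] + in_order(right)

def post_order(node):
    """Visit both subtrees before the node itself."""
    if node is None:
        return []
    left, value, right = node
    return post_order(left) + post_order(right) + [value]

# Hypothetical tree: A with children B (left) and C (right)
tree = ((None, 'B', None), 'A', (None, 'C', None))
```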


T6 - Optimisation Problems

Shortest path problems - Determining which is the shortest path to send packets, circuit board design, game strategy, route planning, project management

Dijkstra's Shortest Path Algorithm - Designed to find the shortest path between a start node and all other nodes in a weighted graph, with the least expensive travel cost from the start noted next to each node

  • Uses a priority queue to keep a record of which vertex is next
  • Initially assigns a temporary distance value to each node which is 0 for the start node and ∞ at every other node
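The two bullets above can be sketched like this (illustrative Python, not from the notes; the graph layout of node → {neighbour: weight} is an assumption):

```python
import heapq

def dijkstra(graph, start):
    """Shortest distance from `start` to every node in a weighted graph.
    Distances start at infinity (0 for the start node); a priority
    queue always yields the closest unvisited node next."""
    distances = {node: float('inf') for node in graph}
    distances[start] = 0
    queue = [(0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if dist > distances[node]:
            continue                      # stale queue entry, skip it
        for neighbour, weight in graph[node].items():
            new_dist = dist + weight
            if new_dist < distances[neighbour]:
                distances[neighbour] = new_dist
                heapq.heappush(queue, (new_dist, neighbour))
    return distances
```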

Computable Problems - A problem is computable if there is an algorithm that can solve every instance of it in a finite number of steps. Some computable problems take far too long to be worth solving: cracking a strong password could take millions of years, so it is in a practical sense insoluble, and a weather forecast has to be computed within a few hours to be useful

Limits of Computation - Imposed by algorithmic complexity and by hardware, e.g. problems that were insoluble 50 years ago are now computable


T6 - The Travelling Salesman Problem (TSP) + Tractability

The Travelling Salesman Problem (TSP) - “Given a list of towns and the distances between each pair of towns, what is the shortest possible route that the salesman can use to visit each town exactly once and return to the starting point”

TSP is a very well-known optimisation problem with applications in planning, logistics, transportation, manufacture of circuit boards and many other fields

One method is the Brute Force Method, where all possible routes are tested: for 5 towns there are 4 × 3 × 2 × 1 = 24 possible routes, and the count grows so quickly (for 10 towns it is 9! = 362,880 routes) that the problem becomes impossible to solve within a reasonable time, so we call the problem intractable, with time complexity O(n!)
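A sketch of the brute-force approach (illustrative Python, not from the notes; the distance table layout is an assumption):

```python
from itertools import permutations

def brute_force_tsp(distances, towns):
    """Try every route that starts and ends at the first town:
    (n - 1)! orderings of the remaining towns -> O(n!)."""
    start, *rest = towns
    best_route, best_length = None, float('inf')
    for perm in permutations(rest):
        route = (start, *perm, start)
        # Sum the distance of each leg of the route
        length = sum(distances[a][b] for a, b in zip(route, route[1:]))
        if length < best_length:
            best_route, best_length = route, length
    return best_route, best_length
```

For 5 towns this loop runs 24 times; doubling the number of towns multiplies the work by hundreds of thousands, which is why the method only works for very small n.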

Tractable Problems - Have a polynomial-time solution or better, i.e. O(n), O(n²), O(nᵏ), with O(log n) being the best

Intractable Problems - Do not have a polynomial-time solution, i.e. O(2ⁿ) and O(n!); these are so inefficient that even though a solution exists, it is impossible to find within a reasonable time for all but very small values of n. Intractable does not mean insoluble


T6 - Heuristic Methods & A* Algorithm

Not all intractable problems are equally hard, so it may be relatively easy to get an answer that is good enough. A heuristic approach is one that tries to find a solution that may not be perfect but is adequate for its purpose, e.g. for the TSP a solution within 2-3% of the optimal route can be found quickly. Heuristics are often used by virus scanners

A heuristic approach is often used in decision making:

  • Using a Rule of thumb
  • Making an Educated guess
  • Using Common sense

A* Algorithm - Dijkstra's algorithm is a special case of this; the A* algorithm has 3 cost functions:

  • g(x) - Real cost from the source to a given node (distance, time, etc.)
  • h(x) - Approximate cost from node x to the goal node
  • f(x) = g(x) + h(x) - Total estimated cost of each node

h(x) is a heuristic function, giving a good or adequate estimate but not necessarily the exact cost; for A* to find the best path, h(x) should never overestimate the real cost from x to the goal

The algorithm focuses only on reaching the goal node, unlike Dijkstra's, which finds the lowest-cost path to every node (so A* is often used in computer games to enable characters to navigate the world)
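A minimal sketch of A* (not from the notes; the graph and heuristic values are invented for illustration), assuming the graph maps node → {neighbour: cost} and h maps each node to its estimated cost to the goal:

```python
import heapq

def a_star(graph, h, start, goal):
    """A* search using f(x) = g(x) + h(x). The priority queue is
    ordered by f, so the most promising node is expanded first;
    h must never overestimate the real cost to the goal."""
    queue = [(h[start], 0, start, [start])]   # (f, g, node, path)
    best_g = {start: 0}
    while queue:
        f, g, node, path = heapq.heappop(queue)
        if node == goal:
            return path, g                    # first pop of goal is optimal
        for neighbour, cost in graph[node].items():
            new_g = g + cost
            if new_g < best_g.get(neighbour, float('inf')):
                best_g[neighbour] = new_g
                heapq.heappush(queue, (new_g + h[neighbour], new_g,
                                       neighbour, path + [neighbour]))
    return None, float('inf')

# Hypothetical example: two routes from S to G, S-B-G is cheaper
graph = {'S': {'A': 1, 'B': 4}, 'A': {'G': 5}, 'B': {'G': 1}, 'G': {}}
h = {'S': 2, 'A': 4, 'B': 1, 'G': 0}          # admissible estimates
```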


Algorithm Practice - Linear Search

function LinearSearch(namelist, nameSought)
   index = -1
   i = 0
   Found = False
   while i < length(namelist) AND NOT Found
      if namelist[i] == nameSought then
         index = i
         Found = True
      endif
      i = i + 1
   endwhile
   return index
endfunction

Algorithm Practice - Binary Search Non-Recursive

function BinarySearch(alist, itemSought)
   index = -1
   Found = False
   first = 0
   last = len(alist) - 1
   while first <= last AND Found = False
      midpoint = (first + last) DIV 2
      if alist[midpoint] = itemSought
         Found = True
         index = midpoint
      else
         if alist[midpoint] < itemSought
            first = midpoint + 1
         else
            last = midpoint - 1
         endif
      endif
   endwhile
   return index
endfunction

Algorithm Practice - Binary Search Recursive

function binarySearch(alist, itemSought, first, last)
   if last < first
      return -1
   else
      midpoint = (first + last) DIV 2
      if alist[midpoint] > itemSought
         return binarySearch(alist, itemSought, first, midpoint - 1)
      else
         if alist[midpoint] < itemSought
            return binarySearch(alist, itemSought, midpoint + 1, last)
         else
            return midpoint
         endif
      endif
   endif
endfunction

Algorithm Practice - Bubble Sort

for i = 0 to n - 2
   for j = 0 to n - i - 2
      if a[j] > a[j+1]
         temp = a[j]
         a[j] = a[j+1]
         a[j+1] = temp
      endif
   next j
next i


Algorithm Practice - Insertion Sort

procedure insertionSort(aList)
   for j = 1 to len(aList) - 1
      nextItem = aList[j]
      i = j - 1
      while i >= 0 and aList[i] > nextItem
         aList[i+1] = aList[i]
         i = i - 1
      endwhile
      aList[i+1] = nextItem
   next j
endprocedure

