The sorting problem 

Given a sequence of n elements x₁, x₂, ..., xₙ ∈ S, rearrange the elements according to some ordering criterion.

Definition: A sorting method is said to be stable if it preserves the relative order of duplicate keys in the file.

Sorting is important for building efficient searching algorithms and data structures, and for data compression. It is a heavily studied problem in computer science, with several widely celebrated algorithms.

Today we will look at brute force sorting solutions. As the name suggests, many of these are quite inefficient on large datasets, but they are very quick to implement and so still hold a place in everyday problem-solving.


Selection sort

Selection sort is a brute force solution to the sorting problem. 

  1. Scan all n elements of the array to find the smallest element, and swap it with the first element.

  2. Starting with the second element, scan the remaining n − 1 elements to find the smallest element and swap it with the element in the second position.

Generally, on pass i (0 ≤ i ≤ n − 2), find the smallest element in A[i . . . n − 1] and swap it with A[i].

Now, let's analyse selection sort with this pseudocode...

ALGORITHM SelectionSort (A[0 . . . n − 1])
/* Order an array using a brute-force selection sort. */
/* INPUT : An array A[0 . . . n − 1] of orderable elements. */
/* OUTPUT : An array A[0 . . . n − 1] sorted in ascending order. */
1. for i=0 to n-2 do
2.     min = i
3.     for j=i+1 to n-1 do
4.         if A[j] < A[min] then
5.            min = j
6.         end if
7.     end for
8.     swap A[i] and A[min]
9. end for
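
To make this concrete, here is a minimal Python sketch of the same procedure (the function name and the in-place behaviour are choices of this sketch, not part of the pseudocode above):

def selection_sort(a):
    # Sort list a in place in ascending order using selection sort.
    n = len(a)
    for i in range(n - 1):            # passes 0 .. n-2
        min_idx = i
        for j in range(i + 1, n):     # scan the unsorted suffix a[i+1 .. n-1]
            if a[j] < a[min_idx]:
                min_idx = j
        a[i], a[min_idx] = a[min_idx], a[i]   # one swap per pass
    return a

For example, selection_sort([29, 10, 14, 37, 13]) returns [10, 13, 14, 29, 37].
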
Best case: Ω(n²)
Average case: Θ(n²)
Worst case: O(n²)

Selection sort needs around n²/2 comparisons and at most n − 1 exchanges. The running time is insensitive to the input, so the best, average, and worst cases are essentially the same.

Why use it? Selection sort makes only O(n) writes but O(n²) reads. When writes (to an array) are much more expensive than reads, selection sort may have an advantage, e.g., with flash memory. Also, for small arrays, selection sort is relatively efficient and simple to implement.

Question: Is selection sort stable?
Answer: No, selection sort is not stable. For example, when sorting [5, 5, 2], the first pass swaps the 2 with the first 5, moving that 5 behind the other one, so the two 5s no longer maintain their relative order.
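
A quick way to see this in Python is to tag equal keys and run the same selection sort logic on them (the tuples below are illustrative data, not from the notes):

items = [(5, 'a'), (5, 'b'), (2, 'c')]      # two equal keys, tagged 'a' and 'b'
for i in range(len(items) - 1):
    m = i
    for j in range(i + 1, len(items)):
        if items[j][0] < items[m][0]:       # compare on the key only
            m = j
    items[i], items[m] = items[m], items[i]
print(items)   # [(2, 'c'), (5, 'b'), (5, 'a')]: the two 5s have swapped order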

Bubble Sort

A bubble sort iteratively compares adjacent items in a list and exchanges them if they are out of order. It is one of the classic (and elementary) sorting algorithms, originally designed for, and efficient on, tape drives; with random access memory it doesn't see much use these days. However, it is still insightful to study it and to understand why other sorting algorithms are superior in one or more respects. Best of all, it is simple to code!

  1. In the first iteration, compare each adjacent pair of elements and swap them if they are out of order. Eventually, the largest element gets propagated to the end.

  2. In the second iteration, repeat the process, but only from the first to the second-to-last element (the last element is already in its correct position). Eventually, the second largest element ends up in the second-to-last position.

Repeat until all elements are sorted.

Now, let's analyse bubble sort with this pseudocode...

ALGORITHM BubbleSort (A[0 . . . n − 1])
/* Order an array using a bubble sort. */
/* INPUT : An array A[0 . . . n − 1] of orderable elements. */
/* OUTPUT : An array A[0 . . . n − 1] sorted in ascending order. */
1. for i=0 to n−2 do
2.     for j=0 to n−2−i do
3.         if A[j + 1] < A[j] then 
4.             swap A[j] and A[j + 1]
5.         end if 
6.     end for
7. end for
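
A direct Python translation of this pseudocode, again as a sketch rather than a polished library routine:

def bubble_sort(a):
    # Sort list a in place in ascending order using bubble sort.
    n = len(a)
    for i in range(n - 1):            # after pass i, the last i+1 elements are in place
        for j in range(n - 1 - i):    # compare adjacent pairs in the unsorted prefix
            if a[j + 1] < a[j]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

For example, bubble_sort([5, 1, 4, 2, 8]) returns [1, 2, 4, 5, 8].
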
Best case (the original file is already sorted): about n²/2 comparisons and 0 exchanges. O(n²)

Worst case (the original file is sorted in reverse order): about n²/2 comparisons and about n²/2 exchanges. O(n²)

Average case (the original file is in random order): about n²/2 comparisons and fewer than n²/2 exchanges. O(n²)

Question: Is bubble sort stable?
Answer: Bubble sort is stable: for each pair, we only ever swap if the elements are out of order, so two equal elements are never swapped past each other.

Early-Termination Bubble Sort

This modification attempts to reduce redundant iterations by checking whether any exchanges take place in each pass. If a pass makes no exchanges, the array is already sorted and the sorting stops after that pass. How does this affect the overall efficiency?
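
A sketch of the modified algorithm in Python, using a flag to record whether the current pass made any exchange (the flag name is illustrative):

def bubble_sort_early_exit(a):
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if a[j + 1] < a[j]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:    # no exchanges in this pass: the array is already sorted
            break
    return a

On an already sorted input the loop exits after a single pass of n − 1 comparisons, which gives the O(n) best case below.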

Best case (the original file is already sorted): only one pass is needed, n − 1 comparisons and 0 exchanges. O(n)

Worst case: no improvement over the original implementation. O(n²)

Average case: depending on the data set, a few passes can be eliminated at the end of the sort, so the number of passes is less than n − 1 and the cost is lower than the original implementation, but it remains O(n²).

Computational Geometry

The Convex Hull Problem

The convex hull of a set of points is the smallest convex polygon that contains all the points, i.e., all the points are “within” the polygon.

Idea: If we can identify all the line segments (pairs of points) that form the boundary of the convex hull, then we have the convex hull. The key observation is that a segment joining two points is part of the boundary exactly when all the other points lie on the same side of the line through those two points.

Complexity of algorithm: Since there are O(n²) pairs of points to examine, and each check requires going through the O(n) remaining points, the algorithm is O(n³).
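
A minimal Python sketch of this brute force idea, using the sign of a cross product to decide which side of the line through p and q a point lies on (function and variable names are illustrative):

from itertools import combinations

def brute_force_hull_edges(points):
    # Return the pairs of points that form the edges of the convex hull.
    # A segment (p, q) is on the boundary when every other point lies on
    # the same side of the line through p and q (or on that line).
    edges = []
    for p, q in combinations(points, 2):          # O(n^2) candidate segments
        sides = set()
        for r in points:
            if r == p or r == q:
                continue
            cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
            if cross != 0:
                sides.add(cross > 0)              # True/False for the two sides
        if len(sides) <= 1:                       # all other points on one side: O(n) check
            edges.append((p, q))
    return edges

For example, brute_force_hull_edges([(0, 0), (4, 0), (2, 3), (2, 1)]) returns the three edges of the triangle formed by (0, 0), (4, 0) and (2, 3); the point (2, 1) is interior and appears on no edge.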

 

Convex hulls come up constantly in practice: think of a modern game and the number of times a convex hull needs to be recalculated every second.


Exhaustive Search

Exhaustive search is a brute force approach that enumerates/generates all possible solutions and then selects the "best" one:

  1. Generate a list of all potential solutions to the problem in a systematic manner.

  2. Evaluate the potential solutions one by one, disqualifying infeasible ones and keeping track of the best one found so far.

  3. When all candidates have been evaluated, announce the best solution(s) found.

Exhaustive search is typically applied to combinatorial problems, and it is insightful to study brute force solutions to them, as some problems can only be solved optimally by exhaustive search.

Knapsack Problem

Given n items of known weights w₁, ..., wₙ and values v₁, ..., vₙ, and a knapsack of capacity W, find the most valuable subset of the items that fits into the knapsack.

Algorithm:

  1. Consider all subsets of the set of n items.

  2. Compute the total weight of each subset to identify the feasible subsets (those whose total weight does not exceed the knapsack's capacity).

  3. Find the subset of the largest value among them.

Complexity: Since the number of subsets of an n-element set is 2ⁿ, an exhaustive search produces an O(2ⁿ) algorithm.
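
A minimal Python sketch of this exhaustive search, trying all 2ⁿ subsets via itertools.combinations (names are illustrative):

from itertools import combinations

def brute_force_knapsack(weights, values, capacity):
    # Examine every subset of the n items and keep the best feasible one.
    n = len(weights)
    best_value, best_subset = 0, ()
    for size in range(n + 1):
        for subset in combinations(range(n), size):        # all 2^n subsets
            total_weight = sum(weights[i] for i in subset)
            if total_weight <= capacity:                    # feasible subsets only
                total_value = sum(values[i] for i in subset)
                if total_value > best_value:
                    best_value, best_subset = total_value, subset
    return best_value, best_subset

Running it on the worked example below, brute_force_knapsack([7, 3, 4, 5], [42, 12, 40, 25], 10) returns (65, (2, 3)), i.e. items 3 and 4 for a total value of $65.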

Let's take a look at an example. We have a knapsack with a maximum capacity of 10 kg. Given the following items, what is the maximum value we can fit in our bag without exceeding the weight limit?

Item Weight (kg) Value
1 7 $42
2 3 $12
3 4 $40
4 5 $25

Applying the algorithm defined above, we consider all possible subsets to identify which combination of items results in the highest value.

Subset Total Weight (kg) Total Value
null 0 $0
{1} 7 $42
{2} 3 $12
{3} 4 $40
{4} 5 $25
{1, 2} 10 $54
{1, 3} 11 Not Possible
{1, 4} 12 Not Possible
{2, 3} 7 $52
{2, 4} 8 $37
{3, 4} 9 $65
{1, 2, 3} 14 Not Possible
{1, 2, 4} 15 Not Possible
{1, 3, 4} 16 Not Possible
{2, 3, 4} 12 Not Possible
{1, 2, 3, 4} 19 Not Possible

The most valuable feasible subset is {3, 4}, with a total weight of 9 kg and a total value of $65.

In the next post, we will look further at exhaustive search in the context of graph traversal methods.