Choosing which way of organising data plays a vital role in the design and analysis of algorithms. In this section, we will look at some of the different data structures used when working with algorithms.

Algorithms process input data, might need to process some immediate data form, then output some results, which could be stored as data. But we do not want to consider algorithms for each type of input data.

Instead, we seek to talk about the data input, immediate storage and output in terms of general characteristics and operations the data should possess.

These data abstractions are called Abstract data types (ADT). An ADT can be defined by the collection of common operations that are accessed through an interface and its characteristics.


Sets

A set is a collection of distinguishable objects, often called members or elements. Sets can be finite or infinite and do not impose any order on the members. Typical operations include add, remove, search. Each element may only appear in the set once. If elements may appear multiple times, the resulting collection is referred to as a bag or multiset.

Simple examples of sets include:

(1)
Binary : S1 = {0, 1},
(2)
Character : S2 = {c, a, y, s, t},
(3)
Word : S3 = { car, house, man, sat, truck },
(4)
Delimiter : S4 = {{!}, {; }, {?}, {.}},
(5)
Variable, prefix-free : S5 = {0, 10, 110, 1110}.

Sequences

A sequence is a collection of elements in which the order of the elements must be maintained, and elements can occur any number of times. A sequence is simply a multiset in which the order is important. Typical operations include add, remove, search. In computer science, can also be referred to as a list (an ordered collection of elements).

Examples of sequences include:

(1) Binary : T1 = “101010110001”,
(2) English : T2 = “The red car belongs to me.”,
(3) Genomic : T3 = “gattcaggaatccgccggtaacgcgcatataattt”.

Linear Data Structures

Array

A (one-dimensional) array is a sequence of items with the same data type stored one after the other in computer memory. Each item in an array is accessible by specifying the value of the array's index. This means that the value of each item can be accessed very quickly regardless of where it is located in the structure.

visual representation of array


Linked list

A linked list is a sequence of zero or more elements called nodes, each containing two kinds of information: some data and one or more links called pointers to other nodes of the linked list. (A special pointer called “null” is used to indicate the absence of a node’s successor.) In a singly linked list, each node except the last one contains a single pointer to the next element.

To access a particular node of a linked list, one starts with the list’s first node and traverses the pointer chain until the particular node is reached. This means that the time requires to access an element of a singly linked list, unlike that of an array, depends on where in the list the element is located, you can simply access an item in the middle of the list without first reading each item that precedes it. 

On the positive side, linked lists do not require any preliminary reservation of the computer memory, so we don't need to know how big out dataset is beforehand, and insertions and deletions can be made quite efficiently in a linked list by reconnecting a few appropriate pointers.

Another extension is the structure called the doubly linked list, in which every node, except the first and the last, contains pointers to both its successor and its predecessor.


Stack

A stack is a collection with two defining operations, push and pop. Push adds a new element at top of the stack, pop removes an element at the top of the stack. Stack abstract data type implements LIFO principle (last in, first out).


Queue

A queue is a collection with two defining operations, enqueue and dequeue. Enqueue adds new element at back of queue, dequeue removes element at front from queue. Queue abstract data type implements LIFO principle (last in, first out).

A priority queue is a queue that has a priority associated with each element it holds. Enqueue is the same as a (non-priority) queue, but for dequeue, the element with the highest priority is removed first. Think of a queue for boarding planes, passengers who are members of the airlines or elderly have priority board plane (dequeue).


Graphs 

A graph G = V, E is defined by a pair of two sets: a finite set V of items called vertices and a set E called edges, representing links/relations/connections between pairs of vertices. A graph G is undirected if the edges do not have a direction (i.e., all of the pairs of vertices in E are unordered). A graph G is directed if the edges from a direction (i.e., all of the pairs of vertices in E have an ordering imposed).

A weighted graph (or weighted digraph) is a graph (or digraph) with numbers assigned to its edges. These numbers are called weights or costs. An interest in such graphs is motivated by numerous real-world applications, such as finding the shortest path between two points in a transportation or communication network or the travelling salesman problem mentioned earlier.


Sets & Dictionaries

Map

A dictionary/map is a collection of (key, value) pairs, such that each key can only appear once in the dictionary/map. Example of a dictionary of former leaders and country they led:

Key
Value
Vladimir Lenin
Soviet Union
Winston Churchill
UK
Mao Ze Dong
China
Kevin Rudd
Australia
Barack Obama USA