Complexity

Please note that all the definitions here are given for positive functions of one integer argument, but they are almost identical for multivariate functions or functions defined on other sets.

Big-Oh Notation (\(\mathcal{O}\))

\[f(n) \in \mathcal{O}(g(n)) \quad \Longleftrightarrow \quad \exists k \in \mathbb{R^+}, n_0 \in \mathbb{N} \quad \text{ s.t. } \quad f(n) \leq k \cdot g(n) \quad \forall n \geq n_0\]

A function \(f(n)\) is said to belong to \(\mathcal{O}(g(n))\) if there is a constant \(k\) such that \(k \cdot g(n)\) is greater than or equal to \(f(n)\) for all sufficiently large \(n\) (that is, there is an \(n_0\) from which the inequality holds).

\(g(n)\) therefore acts as an upper bound on \(f(n)\), up to a constant factor.

Example

Let \(f(n) = 2n^2+3n\). We have \(f(n)\in \mathcal{O}(n^2)\) (in other words, we choose \(g(n)=n^2\)). Indeed, with \(k=3\), the inequality holds from \(n_0=3\) onwards.
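
As a sanity check, the inequality can be verified numerically. Below is a minimal Java sketch (the constants \(k=3\) and \(n_0=3\) come from the example above):

    // Numerical check that f(n) = 2n^2 + 3n <= 3 * n^2 for all n >= 3.
    public class BigOhCheck {
        public static void main(String[] args) {
            for (long n = 3; n <= 3_000_000; n *= 10) {
                long f = 2 * n * n + 3 * n;   // f(n)
                long kg = 3 * n * n;          // k * g(n) with k = 3
                System.out.println("n=" + n + "  f(n)<=3*g(n): " + (f <= kg));
            }
        }
    }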

Similarly, the same function \(f(n) = 2n^2+3n\) belongs to other sets:

  • \(f(n) \in \mathcal{O}(n^3)\)

  • \(f(n) \in \mathcal{O}(n^4)\)

  • \(f(n) \in \mathcal{O}(2^n)\)

because all these functions grow at least as fast as \(n^2\) when \(n\) is large.

In most cases, we will want to choose the smallest possible function \(g(n)\) that satisfies the property, since it gives us the most information.

Big-Omega Notation (\(\Omega\))

The definition is similar to that of Big-Oh. The differences are shown in bold:

\[f(n) \in \mathbf{\Omega}(g(n)) \quad \Longleftrightarrow \quad \exists k \in \mathbb{R^+}, n_0 \in \mathbb{N} \quad \text{ s.t. } \quad \mathbf{k \cdot f(n) \geq g(n)} \quad \forall n \geq n_0\]

For large values of \(n\), \(f(n)\) is always greater than \(g(n)\), up to a constant factor. Concretely, this means that the function \(g(n)\) places a lower bound on the complexity of \(f(n)\). In other words, \(g(n)\) characterizes the « best case » possible for the computation of \(f(n)\) (this is an abuse of language: see below).

Example

In the general case, insertion sort is in \(\Omega(n)\).
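
For reference, here is a textbook insertion sort in Java (the version seen in the course may differ slightly). On an already-sorted array, the inner loop condition fails immediately at every step, so the total work is linear: this best case is exactly what the \(\Omega(n)\) bound captures.

    // Textbook insertion sort: O(n^2) comparisons in the worst case,
    // but only n-1 comparisons when the array is already sorted.
    public static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            // Shift larger elements one position to the right
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }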

Big-Theta Notation (\(\Theta\))

\[f(n) \in \mathbf{\Theta}(g(n)) \quad \Longleftrightarrow \quad \exists k_0,k_1 \in \mathbb{R^+}, n_0 \in \mathbb{N} \quad \text{ s.t. } \quad \mathbf{k_0 \cdot g(n) \leq f(n) \leq k_1 \cdot g(n)} \quad \forall n \geq n_0\]

In other words, for large values of \(n\), \(f(n)\) behaves like \(g(n)\) up to a multiplicative constant. \(g(n)\) thus acts as both a lower and an upper bound.

One can easily see that (proof left as an exercise)

\[f(n) \in \mathbf{\Theta}(g(n)) \quad \Longleftrightarrow \quad f(n) \in \mathbf{\mathcal{O}}(g(n)) \quad\wedge\quad f(n) \in \mathbf{\Omega}(g(n))\]
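
For instance, reusing \(f(n) = 2n^2+3n\) from the Big-Oh example, the constants \(k_0 = 2\) and \(k_1 = 3\) witness that \(f(n) \in \Theta(n^2)\):

\[2 \cdot n^2 \;\leq\; 2n^2+3n \;\leq\; 3 \cdot n^2 \quad \forall n \geq 3\]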

Notes

It is not always possible to find a function \(g(n)\) such that \(f(n) \in \Theta(g(n))\). For example, insertion sort is in \(\mathcal{O}(n^2)\) but in \(\Omega(n)\), and since both of these bounds are reached (for some inputs), it is not possible to say that insertion sort is in \(\Theta(g(n))\) for any function \(g(n)\).

Example

Merge sort is in \(\Theta(n\log_2 n)\).
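
For reference, here is a standard top-down merge sort sketch in Java (not necessarily the exact version from this course). Unlike insertion sort, it performs the same amount of work whatever the input, which is why it admits a \(\Theta\) bound:

    // Top-down merge sort of a[lo..hi) : log2 n levels of recursion,
    // each doing O(n) work in merge, hence Theta(n log2 n) overall.
    public static void mergeSort(int[] a, int lo, int hi) {
        if (hi - lo <= 1) return;        // 0 or 1 element: already sorted
        int mid = (lo + hi) / 2;
        mergeSort(a, lo, mid);
        mergeSort(a, mid, hi);
        merge(a, lo, mid, hi);
    }

    private static void merge(int[] a, int lo, int mid, int hi) {
        int[] tmp = new int[hi - lo];
        int i = lo, j = mid;
        for (int k = 0; k < tmp.length; k++) {
            // Take from the left half while it has the smaller element
            if (i < mid && (j >= hi || a[i] <= a[j])) tmp[k] = a[i++];
            else tmp[k] = a[j++];
        }
        System.arraycopy(tmp, 0, a, lo, tmp.length);
    }

Calling mergeSort(a, 0, a.length) sorts the whole array.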

Tilde Notation (\(\sim\))

The definition of the tilde notation is based on a different principle:

\[f(n) \sim g(n) \quad \quad \Longleftrightarrow \quad \lim_{n\rightarrow\infty} \frac{f(n)}{g(n)} = 1\]

This seemingly more complicated definition simply says that, for large values of \(n\), \(f(n)\) and \(g(n)\) behave the same way: the intuition is therefore somewhat the same as for \(\mathcal{O}\). Moreover, we also have:

\[f(n) \sim g(n) \quad \quad \Longrightarrow \quad f(n) \in \mathcal{O}(g(n))\]
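
Indeed, applying the definition of the limit with \(\varepsilon = 1\) gives an \(n_0\) from which \(f(n)/g(n) \leq 2\), that is:

\[f(n) \leq 2 \cdot g(n) \quad \forall n \geq n_0\]

so \(k = 2\) works in the definition of \(\mathcal{O}\).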

But the converse is not true. Indeed, if we take the example of an algorithm whose execution time is \(A(n) = 2n\) because it needs to go through a list twice, we have:

\[A(n) \not\sim n \quad \text{since} \quad \lim_{n\rightarrow\infty} \frac{A(n)}{n} = 2\]

This example shows us the main difference between \(\mathcal{O}\) and \(\sim\): tilde keeps the multiplicative factor.
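
For instance, reusing \(f(n) = 2n^2+3n\):

\[\lim_{n\rightarrow\infty} \frac{2n^2+3n}{2n^2} = 1 \quad \text{whereas} \quad \lim_{n\rightarrow\infty} \frac{2n^2+3n}{n^2} = 2\]

so \(f(n) \sim 2n^2\), but \(f(n) \not\sim n^2\), even though \(f(n) \in \mathcal{O}(n^2)\).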

There is another difference: tilde provides a bound that is actually reached. For example, according to the definition of \(\mathcal{O}\), we have that

  • \(n \in \mathcal{O}(n)\)

  • \(n \in \mathcal{O}(n^2)\)

  • \(n \in \mathcal{O}(2^n)\)

because \(n\), \(n^2\) and \(2^n\) all grow at least as fast as \(n\). On the other hand, we have that

  • \(n \sim n\) (of course …)

  • \(n \not\sim n^2\)

  • \(n \not\sim 2^n\)

because for the latter two functions, the limit of the ratio is 0, not 1!

There are other, more subtle differences, which we will discuss in the exercises.

Best case, worst case, average case

We too often hear that \(\mathcal{O}\) is the worst case and \(\Omega\) the best case. In general this is false: it depends on how you define your function.

Let’s say we use the Quick Sort algorithm, which we will see in Part 2 of the course. If you define \(f(n)\) as « the number of comparison operations performed on an array of size n », then you have:

  • \(f(n) \sim n^2/2\) and \(f(n) \in \mathcal{O}(n^2)\)

  • \(f(n) \in \Omega(n\log_2 n)\)

If you now define \(g(n)\) as « the average (expected) number of comparison operations performed on an array of size n, when the arrays are selected uniformly at random », you get:

  • \(g(n) \sim 2n\ln n \approx 1.39\; n\log_2 n\) and \(g(n) \in \mathcal{O}(n\log_2 n)\)

  • \(g(n) \in \Omega(n\log_2 n)\)

  • and therefore \(g(n) \in \Theta(n\log_2 n)\)

By a (slight) abuse of language, we say that the « average case » of Quick Sort is in \(\Theta(n\log_2 n)\). But the general case is not!
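
To make the difference concrete, here is a textbook quicksort sketch in Java with the first element as pivot (the version seen in Part 2 of the course may differ). On an already-sorted array, every partition is maximally unbalanced and the recursion depth is linear, which produces the quadratic worst case; on uniformly random arrays, the partitions are balanced on average, hence the \(\Theta(n\log_2 n)\) expected behaviour.

    // Textbook quicksort, first element as pivot (illustration only).
    // Already-sorted input: unbalanced partitions, ~ n^2/2 comparisons.
    // Uniformly random input: Theta(n log2 n) comparisons on average.
    public static void quickSort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int pivot = a[lo];
        int i = lo;
        // Partition: elements smaller than the pivot end up in a[lo+1..i]
        for (int j = lo + 1; j <= hi; j++) {
            if (a[j] < pivot) {          // one comparison per element
                i++;
                int t = a[i]; a[i] = a[j]; a[j] = t;
            }
        }
        int t = a[lo]; a[lo] = a[i]; a[i] = t;  // put the pivot in place
        quickSort(a, lo, i - 1);
        quickSort(a, i + 1, hi);
    }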

Amortized Complexity

Another useful type of complexity counts the average complexity over a sequence of \(m\) operations; it is called the amortized complexity. For example, an ArrayList in Java is backed by an array that grows by a constant factor as soon as its capacity is reached (we assume doubling here for simplicity). This resizing runs in \(\mathcal{O}(n)\), where \(n\) is the current size of the array. Inserting \(n+1\) elements with the add(E e) method when the array has a current capacity of \(n\) thus costs \(n\) insertions at \(\mathcal{O}(1)\) each plus one resizing at \(\mathcal{O}(n)\), that is, \((\mathcal{O}(1) \cdot n + \mathcal{O}(n)) / (n+1) = \mathcal{O}(1)\) on average per operation.

Warning: taken in isolation, the complexity of the add(E e) method is indeed \(\Omega(1)\) and \(\mathcal{O}(n)\), where \(n\) is the number of elements in the ArrayList.
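
Below is a minimal sketch of the doubling strategy in Java. It is a simplified stand-in, not the actual java.util.ArrayList implementation (which, among other differences, grows by a factor of about 1.5):

    import java.util.Arrays;

    // Simplified growable array illustrating amortized O(1) insertion.
    public class GrowableArray {
        private int[] data = new int[1];
        private int size = 0;

        public void add(int e) {
            if (size == data.length) {
                // Resizing copies all elements: O(n). But it only happens
                // after n elements have been added since the last resize,
                // so the amortized cost per add stays O(1).
                data = Arrays.copyOf(data, 2 * data.length);
            }
            data[size++] = e;  // the common case: O(1)
        }
    }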

Common complexity classes

| Class | Name | Example |
| --- | --- | --- |
| \(\mathcal{O}(1)\) | Constant | Find the minimum in a sorted array |
| \(\mathcal{O}(\log_2{n})\) | Logarithmic | Binary search |
| \(\mathcal{O}(n)\) | Linear | Iterate over the elements of an array |
| \(\mathcal{O}(n\log_2{n})\) | Linearithmic | Efficient sorting (e.g. merge sort) |
| \(\mathcal{O}(n^2)\) | Quadratic | Inefficient sorting (e.g. insertion sort) |
| \(\mathcal{O}(n^c)\) | Polynomial | Majority of the algorithms in this course |
| \(\mathcal{O}(c^n)\) | Exponential | Knapsack Problem |
| \(\mathcal{O}(n!)\) | Factorial | Brute-force solving of the TSP (all permutations) |