[erlang-questions] Erlang Cost Model

Fri Sep 18 11:59:53 CEST 2015

> Note that O(1) does not mean fast - it just means "constant" time
> without knowing *what* the constant is this information is useless.

It's not completely useless. More than anything it tells you that this
function scales perfectly, independent on the amount of data you shovel
into the function. Of course, the definition of O is such that O(9999) = O(1).

> O(N) means the time depends linearly upon N - but again
> I doubt if anybody really knows if a function is O(N) - when you run
> out of cache or swap space it will certainly not be linear.

Again, the big O notation was never meant for exact measurements of how
many units of time an algorithm takes to complete on an arbitrary
architecture. It is a *very* theoretical tool that requires good
understanding to be usable in practice. It doesn't take into account
implementation-specific details like cache or swap space. It doesn't
even require the definition of a general computation machine (think a
Turing Machine) beforehand. The notation is defined purely on
mathematical functions

O(g) = {f | ∃c,n0 ∈ N : ∀n ≥ n0 : f(n) ≤ cg(n)},

and it is up to us to interpret these so that we actually get anything
interesting out of it. What this notations is good for is describing the
relation between the size of input and the amount of steps we can expect
the algorithm to take to finish^. As such, if we have n data and an
algorithms that runs in O(n^2), we know that if we double n, we can
expect the running time to quadruple. Of course, this is just an
estimate and in reality we can expect things to get a bit more
complicated with concurrency or more frequent cache invalidation.

I agree with the rest of what You said, but I think it is important to
understand the big O notation before we attempt to use it at all, as
then we might end up expecting a whole lot more from it than we're
actually getting, and when things go south, blame the notation instead
of ourselves. :)

[^] This is a simplification. What we mean by the variable n can change
depending on the class of algorithms we are talking about. For example,
sorting algorithms define n to be the amount of elements we are to sort
and define g(x) such that it describes the amount of comparison
operations that will take place, e.g. Quicksort executes nlog(n)
comparison operations when sorting n elements (in the average case).

Cheers,
A. K.