A Gentle Introduction to Algorithm Complexity Analysis

modernerd · on June 14, 2018

The explanation of Big O in Grokking Algorithms was clearest for me. It doesn't assume any knowledge and even re-introduces logarithms and binary search before using them in examples:

- Logarithms and binary search: https://livebook.manning.com/#!/book/grokking-algorithms/cha...

- Big O: https://livebook.manning.com/#!/book/grokking-algorithms/cha...

ultrasounder · on June 14, 2018

THIS book was my first serious introduction to anything BigOh. Playful yet dead nuts on the mission to introduce self taught programmers like me the fundamentals the right way.

vram22 · on June 14, 2018

>It doesn't assume any knowledge and even re-introduces logarithms and binary search before using them in examples

So does the OP.

_mlxl · on June 14, 2018

When I was a computer science undergrad, this is the one video that made algorithmic complexity stick in my head, hopefully someone else finds it as useful as I did;

https://www.youtube.com/watch?v=phbu-K4tlQg

asrp · on June 14, 2018

This is a good video but it gets the definition of omega and theta wrong, or at least their definition is very non-standard.

By default, O, omega and theta all refer to the worst case running time.

    - O(n) means worst_case_run_time <= C n
    - omega(n) means worst_case_run_time >= C n
    - theta(n) means C1 n <= worst_case_run_time <= C2 n

If you want to talk about something other than worst case, you usually just say it in words, like "the average complexity of this algorithm is O(n^2)".

(I can't remember when someone has ever looked at "best case".)

To rephrase

- "This algorithm is O(something)" means I have an upper bound on the running time on the worst inputs.

- "This algorithm is omega(something)" means I have a lower bound on the running time on the worst inputs.

- "This algorithm is theta(something)" means I have both an upper and lower bound on the running time on the worst inputs (that differ only by a constant multiple.

In particular, a lower bound does not mean "best case" (they aren't even in the same part of the sentence).

(There's also some other detail from the video like the bit representation of string length that needs a footnote but that's much less important, I think.)

j2kun · on June 14, 2018

Big-O typically refers to worst case when people discuss algorithms, but its definition has nothing to do with runtime, or even algorithms, at all. It was invented by mathematicians decades before computers existed.

It's also more nuanced than the inequality in your correction. It's a statement about limits of functions.

asrp · on June 14, 2018

Thanks for the clarification. I never really know how much (accuracy) to put in a post. I'm assuming that part of it is what turned some folk away from learning complexity in the first place. (I have intuition but not a firm grasp of what turns people away either.)

The implicit default of "worst-case running time" (in the context of algorithm) may actually be the source of errors like the one made in the video. For what its worth, I actually think the lim sup definition is easier because its fewer things to memorize.

pmattos · on June 14, 2018

Nice overview ;) Minor typo in theta(n): the bounds should be C1 n and C2 n.

asrp · on June 14, 2018

Thanks, fixed!

hackermailman · on June 14, 2018

This Parallel & Sequential Algorithms book from CMU uses a simpler language cost model instead of the traditional machine based model to do analysis if anybody is interested http://www.parallel-algorithms-book.com/

When using a machine model, we have to reason about how the algorithm compiles and runs on that machine. For example, if we express our algorithm in a low-level language such as C, cost analysis based on a machine model that represents a von Neumanm machine is straightforward because there is an almost one-to-one mapping of statements in C to the instructions of such a machine. For higher-level languages, this becomes trickier. There may be uncertainties, for example, about the cost of automatic memory management, or the cost of dispatching in an object-oriented language. For parallel programs, cost analysis based on machine-based models even more tricky, since we have to reason about how parallel tasks of the algorithm are scheduled on the processors of the machine.

strainer · on June 15, 2018

    for ( var i=0; i<n; ++i ) {..

In Javascript this will loop from 0 to n-1 inclusive regardless of using ++i or i++, because that loop 'stepping statement' happens after the loop body.

Its possible to combine the step in the test to go from 1 to n-1

    for ( var i=0; ++i<n; ) {..

best to be aware of the quirk and just do

    for ( var i=1; i<n; i++ ) {..

vstuart · on June 17, 2018

Excellent article (previous discussion on HN here: https://news.ycombinator.com/item?id=4200846)!