Mar 19

The tale of two algorithms: the importance of algorithm analysis in our daily programming tasks

Quoting Wikipedia: “In computer science, the **analysis of algorithms** is the determination of the amount of resources (such as time and storage) necessary to execute them. Most algorithms are designed to work with inputs of arbitrary length. Usually, the efficiency or running time of an algorithm is stated as a function relating the input length to the number of steps (time complexity) or storage locations (space complexity).”

**“Algorithm analysis is an important part of a broader computational complexity theory, which provides theoretical estimates for the resources needed by any algorithm which solves a given computational problem. These estimates provide an insight into reasonable directions of search for efficient algorithms.”**

This leads us to another important concept, quoting Victor S.Adamchik, CMU, 2009:

“**Algorithmic complexity** is concerned with how fast or slow a particular algorithm performs. We define complexity as a numerical function `T(n)` – time versus the input size `n`. We want to define the time taken by an algorithm without depending on the implementation details. But you agree that `T(n)` does depend on the implementation! A given algorithm will take different amounts of time on the same inputs depending on such factors as: processor speed, instruction set, disk speed, and brand of compiler. The way around this is to estimate the efficiency of each algorithm asymptotically. We will measure time `T(n)` as the number of elementary “steps” (defined in any way), provided each such step takes constant time.

Let us consider a classical example: addition of two integers. We will add two integers digit by digit (or bit by bit), and this will define a “step” in our computational model. Therefore, we say that addition of two n-bit integers takes `n` steps. Consequently, the total computational time is `T(n) = c * n`, where `c` is the time taken by the addition of two bits. On different computers, the addition of two bits might take different amounts of time, say `c1` and `c2`; thus the addition of two n-bit integers takes `T(n) = c1 * n` and `T(n) = c2 * n` respectively. This shows that different machines result in different slopes, but the time `T(n)` **grows linearly as input size increases**.”
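The bit-by-bit addition model in the quote can be sketched in code. This is a minimal Python sketch of my own (not from the original post): each single-bit addition with carry counts as one elementary “step”, so adding two n-bit integers takes exactly `T(n) = n` steps.

```python
# Minimal sketch of the bit-by-bit addition model: each single-bit
# addition (with carry) is one constant-time elementary "step", so
# adding two n-bit integers takes T(n) = n steps.

def add_bits(a_bits, b_bits):
    """Add two equal-length bit lists (least significant bit first).

    Returns the sum bits and the number of elementary steps taken.
    """
    steps = 0
    carry = 0
    result = []
    for x, y in zip(a_bits, b_bits):
        s = x + y + carry   # one constant-time elementary step
        result.append(s % 2)
        carry = s // 2
        steps += 1
    if carry:
        result.append(carry)
    return result, steps

# 6 (binary 110) + 5 (binary 101), bits listed least-significant-first:
bits, steps = add_bits([0, 1, 1], [1, 0, 1])
print(bits, steps)  # -> [1, 1, 0, 1] 3  (i.e. 11, computed in n = 3 steps)
```

Doubling the number of bits doubles the step count, which is exactly the linear growth (different `c` per machine, same slope behavior) the quote describes.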

So, for this post, I will write about the **importance of writing efficient code** (in memory and in computations). For this, let's set up the experiment.

Assume we have **two algorithms that perform the same task** (say, sorting an array of numbers), `f1` and `f2`, with their complexities explained below:

Complexity of `f1` algorithm (asymptotic notation): `O(n^2)`

Complexity of `f2` algorithm (asymptotic notation): `O(n)`

Here we can see that `f1` has quadratic complexity and `f2` has linear complexity.

Now, to make things more interesting, say that **we also have two computers**: one is a supercomputer that can perform `1e10` calculations per second, and the other is a home computer that performs `1e6` calculations per second. Therefore, the supercomputer is 10,000 times faster than the home computer!

The supercomputer will be using algorithm `f1` and the home computer will be using algorithm `f2`.

Counting the number of operations performed by each algorithm:

When `n=500`, we observe that algorithm `f1` outperforms algorithm `f2` by performing fewer operations.

When `n=1000`, we observe that algorithm `f2` now outperforms algorithm `f1`. Therefore, we can suggest that algorithm `f1` may be the better implementation for small values of `n` (say, for values of `n<=600`).

When `n=10000`, we observe that algorithm `f1` performs far more operations than algorithm `f2`. This result suggests that algorithm `f1` is not suited for large input sizes. For this test, algorithm `f2` is the winner!
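The operation counts above can be tabulated with a short script. The post does not show the exact step-count functions, so the constants below (`6 * n**2` for `f1` and `3600 * n` for `f2`, which cross over near `n = 600`) are my assumptions, chosen only to illustrate the behavior described:

```python
# Assumed step-count functions -- the post's exact constants are not
# shown, so these are illustrative only (crossover near n = 600).

def f1_ops(n):
    return 6 * n ** 2   # quadratic algorithm

def f2_ops(n):
    return 3600 * n     # linear algorithm

for n in (500, 1000, 10_000):
    winner = "f1" if f1_ops(n) < f2_ops(n) else "f2"
    print(f"n={n}: f1={f1_ops(n):,} ops, f2={f2_ops(n):,} ops "
          f"-> {winner} performs fewer operations")
```

Whatever the actual constants are, the shape of the result is the same: the quadratic term always overtakes the linear one past some crossover point, so `f1` can only win for small inputs.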

Comparing computer performances:

Remember that the supercomputer will be using algorithm `f1` and the home computer will be using algorithm `f2`, and that the supercomputer is 10,000 times faster than the home computer.

When `n=1e7`, we observe that the supercomputer (running algorithm `f1`) outperforms by taking fewer hours to complete the task.

When `n=1e8`, we observe that the supercomputer (running algorithm `f1`) takes approximately 600 hours to complete, while the home computer (running algorithm `f2`) takes approximately 100 hours.

When `n=1e10`, we observe that the **supercomputer** (running algorithm `f1`) **takes approximately 600 years to complete**, while the **home computer** (running algorithm `f2`) **takes approximately 1.5 years**.
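The running-time comparison is just each algorithm's step count divided by its machine's speed. The step-count constants below (`2 * n**2` and `3600 * n`) are my assumptions, picked so the sketch lands near the figures reported above; the exact functions of the original experiment may differ:

```python
# Hypothetical timing sketch; the step-count constants are assumptions
# chosen to roughly reproduce the running times reported in the text.
SUPER_OPS_PER_SEC = 1e10  # supercomputer, runs the quadratic f1
HOME_OPS_PER_SEC = 1e6    # home computer, runs the linear f2

def super_f1_seconds(n):
    return 2 * n ** 2 / SUPER_OPS_PER_SEC

def home_f2_seconds(n):
    return 3600 * n / HOME_OPS_PER_SEC

HOUR, YEAR = 3600, 3600 * 24 * 365
for n in (1e7, 1e8, 1e10):
    t1, t2 = super_f1_seconds(n), home_f2_seconds(n)
    print(f"n={n:.0e}: supercomputer+f1 {t1 / HOUR:,.0f} h ({t1 / YEAR:.2f} y), "
          f"home+f2 {t2 / HOUR:,.0f} h ({t2 / YEAR:.2f} y)")
```

The key point survives any choice of constants: the quadratic step count grows so fast that, past some input size, no constant-factor hardware speedup (here 10,000x) can compensate for it.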

We observed that the supercomputer, despite being 10,000 times faster than the home computer, is easily outperformed by a simple $500.00 USD home computer when it runs an inefficient implementation of an algorithm.

So, regardless of how much CPU power and how many gigabytes of RAM your supercomputer has, if the algorithm is coded poorly (not optimized), your hardware will not help much.

That’s my advice and my suggestion for your daily programming: I personally think that the hardware capabilities we currently have at our disposal are not an excuse to write *hungry-redundant* code (in memory and in computations, respectively). I remember that my home computer (back in 2000) had 8 GB of hard drive space, 256 MB of RAM, and a 0.1 GHz processor. So please, do not write code that eats 60 GB of RAM to open a `*.doc` file.

You can download the code by clicking here.

Tags: Computer Science, Machine Learning, R