Friday, July 28, 2006

Network flow : integer vs real values

Integer values are assumed in the analysis of most of the algorithms based on network flow. The reason for this is as follows.....

If the capacities on all the edges are integral then there always exists an integral flow.

If the capacities are rational numbers we can take the LCM and apply the same network flow algorithm (ford-fulkerson) by choosing augmenting paths arbitrarily.

If the capacities are real numbers, ford-fulkerson algorithm runs in polynomial time if augmenting paths are chosen via BFS. The algorithm might not terminate if the augmenting paths are chosen arbitrarily.

Assuming integer values makes the analysis much simpler, without worrying about how the augmenting paths are chosen. As said earlier, real-values can be handled efficiently using BFS.

A recent paper [1] analyzes the DFS approach of choosing augmenting paths.

[1] Finite Termination of "Augmenting Path" Algorithms in the Presence of Irrational Problem Data Brian C Dean, Michel X. Goemans, Nicole Immorlica. To appear in the proceedings of the 14th annual European Symposium on Algorithms (ESA), 2006.

Thursday, July 27, 2006

Graph Theory Books

Free online (third) edition of Diestel's Graph Theory book is available here. This is one of the first books on Graph Theory, I read. Graph Theory with Applications by Bondy and Murty is another good introductory book. Its free online edition is here. I find lots of references to this book in graph theory and discrete math publications. Douglas West's Book is another excellent introductory book. Together these three books make an invaluable addition to your theory books library. I like the exercises in these books. If you read the chapters carefully you can come up with elegant and beautiful proofs for the exercises. If books have taglines, I would call these books Graph Theory: Fun Unlimited.

Monday, July 24, 2006

Complexity of Graph Isomorphism

With the weekend temperatures of CA reaching 100, I had no choice but to sit at home in front of my small but efficient table fan. Having nothing else to do, I picked up this book, 'The Graph Isomorphism Problem: Its Structural Complexity', addressing the complexity of the graph isomorphism problem (ISO). The book uses ISO to illustrate the concepts of structural complexity. I learnt more about the structural complexity theory from this book than the ISO itself. This book makes an excellent weekend read (if you have some necessary background in graph theory and computational complexity). I am glad I read this book.

ISO problem is to decide whether there is a bijective mapping from the nodes of one graph to the nodes of the second graph such that the edge connections are respected. It is considered to be a hard problem and is not known to be in P or NP-Complete. Many researchers believe that ISO is not NP-Complete !! A complexity class called GI (graph isomorphism), (which is conjectured to be disjoint from both P and NP-Complete) is defined to understand the complexity of ISO !!

However, NP-Completeness of sub-graph isomorphism can be proved by reduction from HAMILTONIAN-PATH. (duh !)

ISO is in P for the following restricted classes :
  • bounded-degree graphs
  • planar graphs
  • trees (read this book "Algorithms on Trees and Graphs" by Gabriel Valiente for the various versions of tree-isomorphism (labeled vs unlabeled, top-down vs bottom-up))

ISO finds its applications in a wide range of areas like computational biology, data-mining, computational chemistry, matrix theory and so on. If you are looking for software, nauty serves as an excellent practical tool for graph isomorphism.

Friday, July 14, 2006

Number Magic

Guess a positive integer n

if (n is even) {
n = n/2
} else { /* n is odd */
n = 3n + 1
}

Apply this process repeatedly.
This process will always reach the number 1

Eg : 6 -- 3 -- 10 -- 5 -- 16 -- 8 -- 4 -- 2 -- 1

This is Collatz conjecture and remains open till date !!

Thursday, July 13, 2006

CRICKET is NP-Complete

How do we decide the order of batsmen to maximize the cumulative runs made by them in 50 overs ? Here is my perspective......

Let's say we have the following statistics of batsmen based on their past performances.
  • average time spent by a batsman on the field.
  • average runs per over made (during past partnerships) for each pair of batsmen.

Given this data, we would like to decide the order of batsmen to maximize their runs (assuming they don't deviate much from their average past performance).

Formally speaking, we have a complete undirected weighted (vertex-weighted and edge-weighted) graph on n vertices. The vertices represent the batsmen and the weights on the vertices indicate the average time spent by the batsman on the field. The weight w on the edge (u,v) represents the average runs per over made (during past partnerships) by (u,v). That is, if u and v are playing on the field, they can together contribute at the rate of w runs per over. Having a bad partner hurts you, even if you are a high-scorer.

For example, the input graph can be as shown below...




The best schedule for m (=50) overs can be as shown below....



In the above figure, we have two rows representing the two batsmen on the field at any given time. The number of runs for the above schedule is derived by adding the length-of-overlap*edge-weight for each pair of players.

Theorem : Given the input graph G(V,E), finding the optimum (maximizing runs) schedule is NP-Complete.

I will post the proof at a later time in a later post. In the menwhile, you can try your own approaches to prove it. The next obvious question is about the approximability.

This model might not be pratical (due to noise, does not take other factors (like the skills of bowlers and bowler's schedule) into account) in real-life, but serves as a good mathematical puzzle.

P.S. Beginners can read the rules of cricket here. World Cup 2007 Schedule is here. Add this cool countdown to world cup to your google personalized homepage.

Wednesday, July 12, 2006

Transactions Minimization Problem

In April 2000, I visited Agra with one of my friends (Amit) and his roommates (X,Y,Z) . At the end of the tour we ended up owing money to each other, due to the transactions during the trip (Eg : Amit paid for my taxi, I paid for X's food, Y paid for Amit's photo's at taj mahal etc etc.,). At the end of the trip we have to decide "who owes whom how much".

Normal human-beings might solve this problem in a step by step fashion through multiple transactions. Since we are computer sceintists, we will add some "constraints" and solve this problem "efficiently"'.

Formally speaking, we have a weighted directed graph G(V,E) where the vertices represent the people and a directed edge (u,v) with weight w means that u owes w-dollars to v.

For example, the following graph shows that a owes $10 to b, and b owes $20 to c.



There are two ways to perform the transactions.

First way :
  • a withdraws $10 from ATM
  • b withdraws $20 from ATM
  • a pays $10 to b
  • b pays $20 to c

Second way : reduce G to G' as shown below :



  • a withdraws $10 from ATM
  • b withdraws only $10 from ATM
  • a pays $10 to c
  • b pays $10 to c

Second way is better in the following sense :
  • b has withdrawn just as much money as required.
  • there are smaller (less money involved) transactions.
  • often there are lesser transactions (you can construct a complicated example to see this)
  • The transactions are still parallelizable !! (i.e., all the payments can occur simultaneously).

Formally, we want to reduce the graph G(V,E) to G'(V',E') such that,

  1. Net amount (payed/received) is preserved for each person. (I am too lazy to write down the equations)
  2. |E'| is as small as possible (or)
  3. The sum of the weights of the edges in G' is minimized (or)
  4. The maximum weight of an edge in G' is minimized.

Question : Design an efficient algorithm to reduce G to G'

I will stop here and let you solve this problem on your own.....

Monday, July 10, 2006

Graceful Tree Conjecture

Label the vertices of a simple undirected graph G(V,E) (where V = n and E = m) with integers from 0 to m. Now label each edge with absolute difference of the labels of its incident vertices. The labeling is said to be graceful if the edges are labelled 1 through m inclusive (with no number repeated).

A graph is called graceful if it has atleast one such labeling. This labeling was originally introduced in 1967 by Rosa. The name graceful labeling was coined later by Golomb.

Gracefully labeled graphs serve as models in a wide range of applications including coding theory and communication network addressing.

The graceful labeling problem is to determine which graphs are graceful. It is conjectured (by Kotzig, Ringel and Rosa) that all trees are graceful.

Despite numerous (more than 200) publications on graceful labeling for over three decades, only a very restricted classes of trees have been shown to be graceful. These restriced classes include paths, stars, complete bi-partite graphs, prism graphs, wheel graphs, caterpillar graphs, olive trees, and symmetrical trees.