Problem Statement:
Find the LCS or Longest Common Sub-sequence (not necessarily contiguous) of the two given Sequences.
Seq 1: P R A T E E K
Seq 2: R A T H O R E
LCS is : R A T E
Solution :
LCS applications:
1. Molecular Biology :
To compare DNA sequences
2. File Comparison:
3. Screen Redisplay:
As in Mobile display only the changed pixel are updated, instead of updating the whole screen.
This is problem is combinatorial in nature and therefore can be solved by Brute force,
Possible Subset of the given sequences of length m and n will be 2^m and 2^n respectively, which will be matched by one with each other. Thus , the Complexity of the algorithm will be exponential.
Now,
Question is "Can we do Better" as Professor Tim Roughgarden says.
We can see this problem fulfilling the properties of Dynamic Programming.
Principle behind Dynamic Programming says , start with recursive algorithm , which may be inefficient because of repeated computation of sub-problems. Simply remember the solution of sub-problems, which reduce the recomputing and will be simply looked up when required.
Thus computation time reduces to proportional to the number of sub-problems.
Dynamic Programming Properties:
1. Optimal sub-structure :
We can decompose our problem in to smaller sub-sequence and calculate solution for them.
2. Overlapping Sub-Problem:
Our Problem involves kind of recursion.
Given
Seq 1: P R A T E E K
Seq 2: R A T H O R E
Here ,all those characters are picked which come from diagonal cell, i.e. which are marked in orange, cell in green indicates the Longest Common Sub-sequence
Full Source Code: LINK
References:1. http://www.ics.uci.edu/~eppstein/161/960229.html
2. Refer NPTEL videos of design and algoritms (IIT Mumbai)
3. http://www.youtube.com/watch?v=wJ-rP9hJXO0
Please comment and post suggestions
Happy Coding!! :)
Find the LCS or Longest Common Sub-sequence (not necessarily contiguous) of the two given Sequences.
Seq 1: P R A T E E K
Seq 2: R A T H O R E
LCS is : R A T E
Solution :
LCS applications:
1. Molecular Biology :
To compare DNA sequences
2. File Comparison:
3. Screen Redisplay:
As in Mobile display only the changed pixel are updated, instead of updating the whole screen.
This is problem is combinatorial in nature and therefore can be solved by Brute force,
Possible Subset of the given sequences of length m and n will be 2^m and 2^n respectively, which will be matched by one with each other. Thus , the Complexity of the algorithm will be exponential.
Now,
Question is "Can we do Better" as Professor Tim Roughgarden says.
We can see this problem fulfilling the properties of Dynamic Programming.
Principle behind Dynamic Programming says , start with recursive algorithm , which may be inefficient because of repeated computation of sub-problems. Simply remember the solution of sub-problems, which reduce the recomputing and will be simply looked up when required.
Thus computation time reduces to proportional to the number of sub-problems.
Dynamic Programming Properties:
1. Optimal sub-structure :
We can decompose our problem in to smaller sub-sequence and calculate solution for them.
Fig: Optimal Sub-Structure |
Our Problem involves kind of recursion.
Given
Seq 1: P R A T E E K
Seq 2: R A T H O R E
Fig: LCS for given strings |
Full Source Code: LINK
/** * Longest Common Sequnece Subroutine */ public static int lcs(String s1, String s2){ char[] X = s1.toCharArray(); char[] Y = s2.toCharArray(); int[][] LCS = new int[Y.length +1 ][X.length +1 ]; for(int i=1;i<=Y.length;i++) { for(int j=1;j<=X.length;j++) { if(Y[i -1 ]== X[j-1]) LCS[i][j] = 1 + LCS[i-1][j-1]; else LCS[i][j] = Math.max(LCS[i-1][j], LCS[i][j-1]); } } // Start Trace Back StringBuilder sb= new StringBuilder(); int i=Y.length ; int j=X.length ; while(i > 0 && j>0) { //System.out.println(i + " " + j); if(Y[i-1] == X[j-1]) { sb = sb.append(X[j-1]); i--; j--; } else { if(LCS[i][j-1]> LCS[i-1][j]) j--; else i--; } } System.out.println(sb.reverse()); return LCS[Y.length][X.length]; }
References:1. http://www.ics.uci.edu/~eppstein/161/960229.html
2. Refer NPTEL videos of design and algoritms (IIT Mumbai)
3. http://www.youtube.com/watch?v=wJ-rP9hJXO0
Please comment and post suggestions
Happy Coding!! :)