10829 - L-Gap Substrings
Moderator: Board moderators
-
- New poster
- Posts: 15
- Joined: Thu Oct 16, 2003 7:43 pm
- Location: M
- Contact:
10829 - L-Gap Substrings
Any hint for solving this problem ??
In the worst case (a string with 50,000 same character), the solution is greater than 500,000,000 L-GAP. O.K. We can manage that case.
I think, even using KMP's algrithm is not possible count all L-GAPs in time
Please, help.
In the worst case (a string with 50,000 same character), the solution is greater than 500,000,000 L-GAP. O.K. We can manage that case.
I think, even using KMP's algrithm is not possible count all L-GAPs in time
Please, help.
-
- New poster
- Posts: 15
- Joined: Thu Oct 16, 2003 7:43 pm
- Location: M
- Contact:
A g-Gap is defined as a string in the form UVU such that |V| = g.
For example: AALLAA is a 2-Gap taking U = AA and V = LL.
furthermore is 4-GAP because of U = A and V = ALLA and |V| = 4,
finally is not a 6-GAP, because U can not be empty.
The problem consists in given a string S, and a positive integer G, count all substrings S' in S such that S' is a G-Gap.
For example: AALLAA is a 2-Gap taking U = AA and V = LL.
furthermore is 4-GAP because of U = A and V = ALLA and |V| = 4,
finally is not a 6-GAP, because U can not be empty.
The problem consists in given a string S, and a positive integer G, count all substrings S' in S such that S' is a G-Gap.
-
- New poster
- Posts: 19
- Joined: Thu May 02, 2002 5:47 am
- Location: Zhongshan (Sun Yat-sen) University
-
- Guru
- Posts: 724
- Joined: Wed Dec 19, 2001 2:00 am
- Location: Germany
I only solved this problem because for an bioinformatics course I had read a paper which described an algorithm how to find branching tandem repeats. I modified this algorithm for this problem. I know that the problemsetter had a different solution, he told me it is something with divide and conquer and extended kmp (knuth morris pratt algorithm) (but I don't know what he meant with extended).
Re: 10829 - L-Gap Substrings
to Adrian: did you store all the (i,D) pairs -- L-gap tandem repeats -- explicitly? I'm going to use an O(nlogn) algo for branching tandem repeats
Re: 10829 - L-Gap Substrings
Never mind, the question above is silly: we are asked to find the number of occurences, not the number of distinct substrings.
If there is ever a war between men and machines, it is easy to guess who will start it (c) Arthur Clarke
Re: 10829 - L-Gap Substrings
I find all branching tandem repeats with gap 0 <= g <= L and length > L-g in O(L*n*log(n)).
Specifically, for my code finds pairs
but fails to find because the corresponding tandem repeat (g = 0) is not branching.
The question is: how to handle non-branching repeats in O(nlogn)?
Specifically, for
Code: Select all
bbaabaaaaa$
Code: Select all
(6,10), (3,7), (8,10), (4,6), (7,9) and (2,6)
Code: Select all
(6,8),
The question is: how to handle non-branching repeats in O(nlogn)?
Re: 10829 - L-Gap Substrings
for the testcase the output size is
which amounts to O(n^2). Now, given we can find all branching tandem repeats in O(nlogn), how
to handle non-branching tandem repeats?
Code: Select all
s = a^n
Code: Select all
(n-L-1) + (n-L-3) + (n-L-5) + ...,
to handle non-branching tandem repeats?
If there is ever a war between men and machines, it is easy to guess who will start it (c) Arthur Clarke
Re: 10829 - L-Gap Substrings
Well, with Adrian's help I managed to get AC with an O(|GAP|n(logn)^2) algo, and I can't see as yet the way of shortening this to O(nlogn).
My code is 580
lines long, pure C code.
Many thanks to Adrian and to Rujia!
My code is 580
![:wink:](./images/smilies/icon_wink.gif)
Many thanks to Adrian and to Rujia!
If there is ever a war between men and machines, it is easy to guess who will start it (c) Arthur Clarke
-
- New poster
- Posts: 1
- Joined: Wed Oct 31, 2012 5:21 pm
Re: 10829 - L-Gap Substrings
can anyone give me some hint for this problem?i don't know how to solve this? as i think ,may be suffix array,but i don't know how to deal with the gap?can anyone help me? thanks very much! my email:luyuncheng@sina.com