Page **1** of **1**

### 10829 - L-Gap Substrings

Posted: **Tue Mar 08, 2005 12:33 pm**

by **filigabriel**

Any hint for solving this problem ??

In the worst case (a string with 50,000 same character), the solution is greater than **500,000,000** L-GAP. O.K. We can manage that case.

I think, even using KMP's algrithm is not possible count all L-GAPs in time

Please, help.

Posted: **Sat Mar 12, 2005 10:28 am**

by **ibrahim**

hi filigabriel,

I can't understand the porblem. Can you help me?

Posted: **Sat Mar 12, 2005 9:07 pm**

by **filigabriel**

A g-Gap is defined as a string in the form UVU such that |V| = g.

For example: AALLAA is a 2-Gap taking U = AA and V = LL.

furthermore is 4-GAP because of U = A and V = ALLA and |V| = 4,

finally is not a 6-GAP, because U can not be empty.

The problem consists in given a string **S**, and a positive integer **G**, count all substrings **S'** in **S** such that **S'** is a **G**-Gap.

Posted: **Sun Mar 13, 2005 3:44 am**

by **ibrahim**

Thanks filigabriel, i got the problem.

You got any hints????

Regards,

Ibrahim

Posted: **Thu Mar 17, 2005 11:16 am**

by **ardiankp**

Hmm.. still no hint? I also stuck in this problems..

any help will be appreciated

Thx

Posted: **Mon May 02, 2005 10:22 am**

by **Gigi's Baby**

Any hints ?

I could only get an O(n^2) method, but it surely got TLE.

Posted: **Fri Aug 26, 2005 4:23 am**

by **polone**

hmm...

suffix tree is not bad

we can count L-Gap strings during making the tree

I think it should work

I also wonder how the guys on ranklist deal this problem..

Posted: **Fri Aug 26, 2005 11:23 pm**

by **Adrian Kuegel**

I only solved this problem because for an bioinformatics course I had read a paper which described an algorithm how to find branching tandem repeats. I modified this algorithm for this problem. I know that the problemsetter had a different solution, he told me it is something with divide and conquer and extended kmp (knuth morris pratt algorithm) (but I don't know what he meant with extended).

Posted: **Tue Aug 30, 2005 12:54 am**

by **polch**

!!!!!!!!!!!!!!! SPOILER - ATTENTION !!!!!!!!!!!!!!!!!!!!!!!!!!!

Hi,I solved this task using KMR + binary search in O(n*(logn)^2).

For each k being length of U I counted all subwords UVU of text in O(n/k*logk). (|V|=g)

Adrian, how fast (asympt.) algo have you got?

### Re: 10829 - L-Gap Substrings

Posted: **Thu May 28, 2009 3:58 pm**

by **serur**

to Adrian: did you store all the (i,D) pairs -- L-gap tandem repeats -- explicitly? I'm going to use an O(nlogn) algo for branching tandem repeats

### Re: 10829 - L-Gap Substrings

Posted: **Fri May 29, 2009 8:50 pm**

by **serur**

Never mind, the question above is silly: we are asked to find the number of occurences, not the number of distinct substrings.

### Re: 10829 - L-Gap Substrings

Posted: **Sat May 30, 2009 9:11 am**

by **serur**

I find all branching tandem repeats with gap 0 <= g <= L and length > L-g in O(L*n*log(n)).

Specifically, for

my code finds pairs

Code: Select all

`(6,10), (3,7), (8,10), (4,6), (7,9) and (2,6)`

but fails to find

because the corresponding tandem repeat (g = 0) is not branching.

The question is: how to handle non-branching repeats in O(nlogn)?

### Re: 10829 - L-Gap Substrings

Posted: **Sun May 31, 2009 11:30 am**

by **serur**

for the testcase

the output size is

Code: Select all

`(n-L-1) + (n-L-3) + (n-L-5) + ...,`

which amounts to O(n^2). Now, given we can find all branching tandem repeats in O(nlogn), how

to handle

**non**-branching tandem repeats?

### Re: 10829 - L-Gap Substrings

Posted: **Sat Jun 06, 2009 8:05 pm**

by **serur**

Well, with Adrian's help I managed to get AC with an O(|GAP|n(logn)^2) algo, and I can't see as yet the way of shortening this to O(nlogn).

My code is 580

lines long, pure C code.

Many thanks to Adrian and to Rujia!

### Re: 10829 - L-Gap Substrings

Posted: **Thu Apr 04, 2013 9:43 am**

by **luyuncheng**

can anyone give me some hint for this problem?i don't know how to solve this? as i think ,may be suffix array,but i don't know how to deal with the gap?can anyone help me? thanks very much! my email:

luyuncheng@sina.com