## 10679 - I Love Strings!!

Moderator: Board moderators

anupam
A great helper
Posts: 405
Joined: Wed Aug 28, 2002 6:45 pm
Contact:
Thanks Adrian, got it. Found the book, named here, also.
"Everything should be made simple, but not always simpler"

Revenger
Experienced poster
Posts: 132
Joined: Sun Apr 14, 2002 12:27 pm
Location: Russia
Can anyone explain me why my code (it was accepted 3 days later) gets TLE . I am using KMP

[pascal]{\$A-,B-,D+,E+,F-,G-,I-,L-,N-,O+,P-,Q-,R-,S-,T-,V-,X+}
const maxlen = 110000;

type my_string = record
len : longint;
sa : array[1..maxlen]of char;
end;

var ch : char;
begin
a.len := 0;
while true do begin
if (ch = #10) or (ch = #13) then break;
a.len := a.len + 1;
a.sa[a.len] := ch;
end;
end;

var t, ti, k, i, j, q : longint;
inps, tst : my_string;
f : array[1..2000]of longint;

begin
for ti := 1 to t do begin
for i := 1 to k do begin
if tst.len = 0 then begin
writeln('y');
continue;
end;
f[1] := 0;
j := 0;
for q := 2 to tst.len do begin
while (j > 0) and (tst.sa[j + 1] <> tst.sa[q]) do j := f[j];
if tst.sa[j + 1] = tst.sa[q] then j := j + 1;
f[q] := j;
end;
q := 0;
for j := 1 to inps.len do begin
while (q > 0) and (tst.sa[q + 1] <> inps.sa[j]) do q := f[q];
if tst.sa[q + 1] = inps.sa[j] then q := q + 1;
if q = tst.len then break;
end;
if q = tst.len then writeln('y') else writeln('n');
end;
end;
end.[/pascal]

Guru
Posts: 724
Joined: Wed Dec 19, 2001 2:00 am
Location: Germany
It seems the problem limits were changed. Now there are up to 1000 queries.
And with KMP you have to go through the text now up to 1000 times, and that is too much.
So it seems that now you have to use Aho-Corasick (with this algorithm it is sufficient to go through the text once for a given set of patterns).

Observer
Guru
Posts: 570
Joined: Sat May 10, 2003 4:20 am
Location: Hong Kong
Adrian Kuegel wrote:It seems the problem limits were changed. Now there are up to 1000 queries.
And with KMP you have to go through the text now up to 1000 times, and that is too much.
So it seems that now you have to use Aho-Corasick (with this algorithm it is sufficient to go through the text once for a given set of patterns).
Now I think that this question is not very meaningful... It's like nearly impossible to be solved without a good algorithm book at home...
7th Contest of Newbies
Date: December 31st, 2011 (Saturday)
Time: 12:00 - 16:00 (UTC)
URL: http://uva.onlinejudge.org

New poster
Posts: 9
Joined: Sat Jan 19, 2002 2:00 am
I have solved the problem with famous Ukkonen's (I hope the name is correct ) algorithm which is also known as suffix tree method. The original data was not very strong to differentiate between KMP and mine Also it had some leaks( )....Now they are fixed.....

htl
Experienced poster
Posts: 185
Joined: Fri Jun 28, 2002 12:05 pm
Location: Taipei, Taiwan
Could someone provide the resources on the internet about the two algorithms (aho and Ukkonen)? I tried hard and found few of them.

ulin
New poster
Posts: 15
Joined: Wed Jun 30, 2004 11:24 am
Location: POLAND (Plock)

http://www.cs.uku.fi/~kilpelai/BSA04/le ... rint04.pdf
www-sr.informatik.uni-tuebingen.de/~buehler/AC/AC1.html

second one is simulation of this algo. It helped me to understand.

Revenger
Experienced poster
Posts: 132
Joined: Sun Apr 14, 2002 12:27 pm
Location: Russia
Thanks, ulin! It was really helpful for me - I've got Accepted

Revenger
Experienced poster
Posts: 132
Joined: Sun Apr 14, 2002 12:27 pm
Location: Russia

Let's talk about 10679's tests. I think that they are very easy.

First of all, if in Aho-Corasick you compute failure function in dfs order you got accepted. That's very strage, because almost every randomized test check this.

Aho-Corasick have complexity O(m + z), where z is the number of pettern occurences. So, there are many tests where z = O(km). In such test cases Aho-Corasick algo is not better than KMP. But AC algo can be modifed in a such way, that it would have complexity O(m + k) for a current problem.

What's about memory? There is not any good test for checking memory usage. It is very hard to run under 32 MB. If you were not belive me try this test

[pascal]var i, j : longint;

begin
assign(output, '10679.in');
rewrite(output);

randseed := 1017358;

writeln(1);
for i := 1 to 100000 do write(chr(ord('a') + random(26))); writeln;
writeln(1000);
for i := 1 to 1000 do begin
write(chr(ord('a') + ((i div 676) mod 26)));
write(chr(ord('a') + ((i div 26) mod 26)));
write(chr(ord('a') + (i mod 26)));
for j := 4 to 1000 do write(chr(ord('a') + random(26))); writeln;
end;
end.[/pascal]

windows2k
Experienced poster
Posts: 136
Joined: Sat Apr 05, 2003 3:29 pm
Location: Taiwan
Adrian Kuegel wrote:It seems the problem limits were changed. Now there are up to 1000 queries.
And with KMP you have to go through the text now up to 1000 times, and that is too much.
So it seems that now you have to use Aho-Corasick (with this algorithm it is sufficient to go through the text once for a given set of patterns).
I have accepted the problem. Use the same algorithm with yours.
But it is much slower than your program.
How could you speed your program?
Thx

hoangmit
New poster
Posts: 1
Joined: Sun Jul 04, 2004 6:19 am
I've got troubles with reading input in the online judge.

Can anybody tell me why ?

[cpp]#include <iostream>
#include <cstdio>
#include <cassert>
#include <cctype>
#include <cstring>

using namespace std;

char text[1000002];
char pattern[1002];

int main()
{
//freopen("C:\\10679.in", "r", stdin);

int i, j, nTest, nQuery;

scanf("%d\n", &nTest);
for (i = 0; i < nTest; i++)
{
gets(text);
scanf("%d\n", &nQuery);
for (j = 0; j < nQuery; j++)
{
gets(pattern);
if (strlen(pattern) > 0)
assert(!isdigit(pattern[0])); // Must be wrong here
}
}
return 0;
}
[/cpp]

pminkov
New poster
Posts: 8
Joined: Wed Oct 17, 2001 2:00 am
Location: Bulgaria, Sofia
That's because someone introduced characters different than
alphabetic chars and digits (which are not allowed).
I supposed that this was messed up when resubmitting my
problem, but I was too tired to check what was wrong this
time. At least someone can fix the problem statement ...
Generally it's very frustrating to accept during a contest,
and then not being able to accept the problem in the online
judge , and this adds another degree of frustration

Guru
Posts: 724
Joined: Wed Dec 19, 2001 2:00 am
Location: Germany
Aho-Corasick have complexity O(m + z), where z is the number of pettern occurences. So, there are many tests where z = O(km). In such test cases Aho-Corasick algo is not better than KMP. But AC algo can be modifed in a such way, that it would have complexity O(m + k) for a current problem.
Yes, and I modified the algorithm for this problem so it is O(m+k).
What's about memory? There is not any good test for checking memory usage. It is very hard to run under 32 MB. If you were not belive me try this test
Of course I believe you, and my program also needs a lot of memory (more than 32 MB). Theoretically I need up to q*1000*55*4 bytes of memory.
How could you speed your program?
Because only lowercase and uppercase letters are allowed, you need only 52 pointers to childs in your tree. This is the only speedup I used.
That's because someone introduced characters different than
alphabetic chars and digits (which are not allowed).
Yes, I also found out that there exists other characters in the input.
My program handles all invalid input characters as if they are 'a' (of course not intentionally, only because it assumes that only valid characters occur).
As the problemsetter probably reads this thread, I hope the judge input will be fixed soon.

CrazyTerabyte
New poster
Posts: 25
Joined: Fri Jul 16, 2004 3:19 am
Location: Brazil
Contact:

### Time limit in 10679?

Hey, the problem says that time limit is 4 seconds, but I see at ranklist some people with more than 4 seconds (about 13 people, with, at most, 8.6 seconds).

Can someone explain why this?

http://acm.uva.es/cgi-bin/OnlineJudge?P ... :10679:200

Andrey Mokhov
Experienced poster
Posts: 128
Joined: Fri Nov 15, 2002 7:45 am
Location: Kyrgyzstan
Hi!

Time limits in problems means the time limits that were in law during contest. Strangly enough after contest all the problems get default
time limit that is 10 seconds. Quite bad, I guess. I won't get AC on this problem if TL were 4 seconds and would search for a better algo

Maybe in this particular problem the changing of time limit from 4 to 10 seconds can be explained as the data set and problem statement has changed after contest. But they at least should change the title stating TL... but you know someone has to do it... and there are not lots of people who is doing such things - they can't do everything in a moment.