Page 2 of 7

Posted: Wed Jun 30, 2004 6:30 pm
by anupam
Thanks Adrian, got it. Found the book, named here, also.

Posted: Wed Jun 30, 2004 7:05 pm
by Revenger
Can anyone explain me why my code (it was accepted 3 days later) gets TLE :evil: . I am using KMP

[pascal]{$A-,B-,D+,E+,F-,G-,I-,L-,N-,O+,P-,Q-,R-,S-,T-,V-,X+}
const maxlen = 110000;

type my_string = record
len : longint;
sa : array[1..maxlen]of char;
end;

procedure read_str(var a : my_string);
var ch : char;
begin
a.len := 0;
while true do begin
read(ch);
if (ch = #10) or (ch = #13) then break;
a.len := a.len + 1;
a.sa[a.len] := ch;
end;
end;

var t, ti, k, i, j, q : longint;
inps, tst : my_string;
f : array[1..2000]of longint;

begin
readln(t);
for ti := 1 to t do begin
read_str(inps);
readln(k);
for i := 1 to k do begin
read_str(tst);
if tst.len = 0 then begin
writeln('y');
continue;
end;
f[1] := 0;
j := 0;
for q := 2 to tst.len do begin
while (j > 0) and (tst.sa[j + 1] <> tst.sa[q]) do j := f[j];
if tst.sa[j + 1] = tst.sa[q] then j := j + 1;
f[q] := j;
end;
q := 0;
for j := 1 to inps.len do begin
while (q > 0) and (tst.sa[q + 1] <> inps.sa[j]) do q := f[q];
if tst.sa[q + 1] = inps.sa[j] then q := q + 1;
if q = tst.len then break;
end;
if q = tst.len then writeln('y') else writeln('n');
end;
end;
end.[/pascal]

Posted: Wed Jun 30, 2004 8:45 pm
by Adrian Kuegel
It seems the problem limits were changed. Now there are up to 1000 queries.
And with KMP you have to go through the text now up to 1000 times, and that is too much.
So it seems that now you have to use Aho-Corasick (with this algorithm it is sufficient to go through the text once for a given set of patterns).

Posted: Thu Jul 01, 2004 7:22 am
by Observer
Adrian Kuegel wrote:It seems the problem limits were changed. Now there are up to 1000 queries.
And with KMP you have to go through the text now up to 1000 times, and that is too much.
So it seems that now you have to use Aho-Corasick (with this algorithm it is sufficient to go through the text once for a given set of patterns).
Now I think that this question is not very meaningful... It's like nearly impossible to be solved without a good algorithm book at home... :(

Posted: Thu Jul 01, 2004 7:32 am
by Md. Sajjad Hossain
I have solved the problem with famous Ukkonen's (I hope the name is correct :P) algorithm which is also known as suffix tree method. The original data was not very strong to differentiate between KMP and mine :) Also it had some leaks( :o )....Now they are fixed.....
--sajjad

Posted: Thu Jul 01, 2004 11:18 am
by htl
Could someone provide the resources on the internet about the two algorithms (aho and Ukkonen)? I tried hard and found few of them.

Posted: Thu Jul 01, 2004 6:16 pm
by ulin
This links should be helpful

http://www.cs.uku.fi/~kilpelai/BSA04/le ... rint04.pdf
www-sr.informatik.uni-tuebingen.de/~buehler/AC/AC1.html

first one is about Aho-Corasic
second one is simulation of this algo. It helped me to understand.

Posted: Fri Jul 02, 2004 9:20 am
by Revenger
Thanks, ulin! It was really helpful for me - I've got Accepted :)

Bad test data

Posted: Fri Jul 02, 2004 10:24 am
by Revenger
Let's talk about 10679's tests. I think that they are very easy.

First of all, if in Aho-Corasick you compute failure function in dfs order you got accepted. That's very strage, because almost every randomized test check this.

Aho-Corasick have complexity O(m + z), where z is the number of pettern occurences. So, there are many tests where z = O(km). In such test cases Aho-Corasick algo is not better than KMP. But AC algo can be modifed in a such way, that it would have complexity O(m + k) for a current problem.

What's about memory? There is not any good test for checking memory usage. It is very hard to run under 32 MB. If you were not belive me try this test

[pascal]var i, j : longint;

begin
assign(output, '10679.in');
rewrite(output);

randseed := 1017358;

writeln(1);
for i := 1 to 100000 do write(chr(ord('a') + random(26))); writeln;
writeln(1000);
for i := 1 to 1000 do begin
write(chr(ord('a') + ((i div 676) mod 26)));
write(chr(ord('a') + ((i div 26) mod 26)));
write(chr(ord('a') + (i mod 26)));
for j := 4 to 1000 do write(chr(ord('a') + random(26))); writeln;
end;
end.[/pascal]

My program needs about 200 MB to solve this test. What about your program?

Posted: Fri Jul 02, 2004 4:12 pm
by windows2k
Adrian Kuegel wrote:It seems the problem limits were changed. Now there are up to 1000 queries.
And with KMP you have to go through the text now up to 1000 times, and that is too much.
So it seems that now you have to use Aho-Corasick (with this algorithm it is sufficient to go through the text once for a given set of patterns).
Hello, Adrian Kuegel .
I have accepted the problem. Use the same algorithm with yours.
But it is much slower than your program.
How could you speed your program?
Thx :D

Posted: Sun Jul 04, 2004 6:45 am
by hoangmit
I've got troubles with reading input in the online judge. :oops:

Can anybody tell me why ?

[cpp]#include <iostream>
#include <cstdio>
#include <cassert>
#include <cctype>
#include <cstring>

using namespace std;

char text[1000002];
char pattern[1002];

int main()
{
//freopen("C:\\10679.in", "r", stdin);

int i, j, nTest, nQuery;

scanf("%d\n", &nTest);
for (i = 0; i < nTest; i++)
{
gets(text);
scanf("%d\n", &nQuery);
for (j = 0; j < nQuery; j++)
{
gets(pattern);
if (strlen(pattern) > 0)
assert(!isdigit(pattern[0])); // Must be wrong here :-?
}
}
return 0;
}
[/cpp]

Posted: Sun Jul 04, 2004 11:56 am
by pminkov
That's because someone introduced characters different than
alphabetic chars and digits (which are not allowed).
I supposed that this was messed up when resubmitting my
problem, but I was too tired to check what was wrong this
time. At least someone can fix the problem statement ... :(
Generally it's very frustrating to accept during a contest,
and then not being able to accept the problem in the online
judge :(, and this adds another degree of frustration :(

Posted: Sun Jul 04, 2004 4:39 pm
by Adrian Kuegel
Aho-Corasick have complexity O(m + z), where z is the number of pettern occurences. So, there are many tests where z = O(km). In such test cases Aho-Corasick algo is not better than KMP. But AC algo can be modifed in a such way, that it would have complexity O(m + k) for a current problem.
Yes, and I modified the algorithm for this problem so it is O(m+k).
What's about memory? There is not any good test for checking memory usage. It is very hard to run under 32 MB. If you were not belive me try this test
Of course I believe you, and my program also needs a lot of memory (more than 32 MB). Theoretically I need up to q*1000*55*4 bytes of memory.
How could you speed your program?
Because only lowercase and uppercase letters are allowed, you need only 52 pointers to childs in your tree. This is the only speedup I used.
That's because someone introduced characters different than
alphabetic chars and digits (which are not allowed).
Yes, I also found out that there exists other characters in the input.
My program handles all invalid input characters as if they are 'a' (of course not intentionally, only because it assumes that only valid characters occur).
As the problemsetter probably reads this thread, I hope the judge input will be fixed soon.

Time limit in 10679?

Posted: Wed Jul 28, 2004 1:16 am
by CrazyTerabyte
Hey, the problem says that time limit is 4 seconds, but I see at ranklist some people with more than 4 seconds (about 13 people, with, at most, 8.6 seconds).

Can someone explain why this?

http://acm.uva.es/cgi-bin/OnlineJudge?P ... :10679:200

Posted: Wed Jul 28, 2004 8:09 am
by Andrey Mokhov
Hi!

Time limits in problems means the time limits that were in law during contest. Strangly enough after contest all the problems get default
time limit that is 10 seconds. Quite bad, I guess. I won't get AC on this problem if TL were 4 seconds and would search for a better algo :roll: :roll:

Maybe in this particular problem the changing of time limit from 4 to 10 seconds can be explained as the data set and problem statement has changed after contest. But they at least should change the title stating TL... but you know someone has to do it... and there are not lots of people who is doing such things - they can't do everything in a moment.

Buy.
Andrey.