10679 - I Love Strings!!

All about problems in Volume 106. If there is a thread about your problem, please use it. If not, create one with its number in the subject.

Moderator: Board moderators

anupam
A great helper
Posts: 405
Joined: Wed Aug 28, 2002 6:45 pm
Contact:

Post by anupam »

Thanks Adrian, got it. Found the book, named here, also.
"Everything should be made simple, but not always simpler"

Revenger
Experienced poster
Posts: 132
Joined: Sun Apr 14, 2002 12:27 pm
Location: Russia

Post by Revenger »

Can anyone explain me why my code (it was accepted 3 days later) gets TLE :evil: . I am using KMP

[pascal]{$A-,B-,D+,E+,F-,G-,I-,L-,N-,O+,P-,Q-,R-,S-,T-,V-,X+}
const maxlen = 110000;

type my_string = record
len : longint;
sa : array[1..maxlen]of char;
end;

procedure read_str(var a : my_string);
var ch : char;
begin
a.len := 0;
while true do begin
read(ch);
if (ch = #10) or (ch = #13) then break;
a.len := a.len + 1;
a.sa[a.len] := ch;
end;
end;

var t, ti, k, i, j, q : longint;
inps, tst : my_string;
f : array[1..2000]of longint;

begin
readln(t);
for ti := 1 to t do begin
read_str(inps);
readln(k);
for i := 1 to k do begin
read_str(tst);
if tst.len = 0 then begin
writeln('y');
continue;
end;
f[1] := 0;
j := 0;
for q := 2 to tst.len do begin
while (j > 0) and (tst.sa[j + 1] <> tst.sa[q]) do j := f[j];
if tst.sa[j + 1] = tst.sa[q] then j := j + 1;
f[q] := j;
end;
q := 0;
for j := 1 to inps.len do begin
while (q > 0) and (tst.sa[q + 1] <> inps.sa[j]) do q := f[q];
if tst.sa[q + 1] = inps.sa[j] then q := q + 1;
if q = tst.len then break;
end;
if q = tst.len then writeln('y') else writeln('n');
end;
end;
end.[/pascal]

Adrian Kuegel
Guru
Posts: 724
Joined: Wed Dec 19, 2001 2:00 am
Location: Germany

Post by Adrian Kuegel »

It seems the problem limits were changed. Now there are up to 1000 queries.
And with KMP you have to go through the text now up to 1000 times, and that is too much.
So it seems that now you have to use Aho-Corasick (with this algorithm it is sufficient to go through the text once for a given set of patterns).

Observer
Guru
Posts: 570
Joined: Sat May 10, 2003 4:20 am
Location: Hong Kong

Post by Observer »

Adrian Kuegel wrote:It seems the problem limits were changed. Now there are up to 1000 queries.
And with KMP you have to go through the text now up to 1000 times, and that is too much.
So it seems that now you have to use Aho-Corasick (with this algorithm it is sufficient to go through the text once for a given set of patterns).
Now I think that this question is not very meaningful... It's like nearly impossible to be solved without a good algorithm book at home... :(
7th Contest of Newbies
Date: December 31st, 2011 (Saturday)
Time: 12:00 - 16:00 (UTC)
URL: http://uva.onlinejudge.org

Md. Sajjad Hossain
New poster
Posts: 9
Joined: Sat Jan 19, 2002 2:00 am
Location: Bangladesh

Post by Md. Sajjad Hossain »

I have solved the problem with famous Ukkonen's (I hope the name is correct :P) algorithm which is also known as suffix tree method. The original data was not very strong to differentiate between KMP and mine :) Also it had some leaks( :o )....Now they are fixed.....
--sajjad

htl
Experienced poster
Posts: 185
Joined: Fri Jun 28, 2002 12:05 pm
Location: Taipei, Taiwan

Post by htl »

Could someone provide the resources on the internet about the two algorithms (aho and Ukkonen)? I tried hard and found few of them.

ulin
New poster
Posts: 15
Joined: Wed Jun 30, 2004 11:24 am
Location: POLAND (Plock)

Post by ulin »

This links should be helpful

http://www.cs.uku.fi/~kilpelai/BSA04/le ... rint04.pdf
www-sr.informatik.uni-tuebingen.de/~buehler/AC/AC1.html

first one is about Aho-Corasic
second one is simulation of this algo. It helped me to understand.

Revenger
Experienced poster
Posts: 132
Joined: Sun Apr 14, 2002 12:27 pm
Location: Russia

Post by Revenger »

Thanks, ulin! It was really helpful for me - I've got Accepted :)

Revenger
Experienced poster
Posts: 132
Joined: Sun Apr 14, 2002 12:27 pm
Location: Russia

Bad test data

Post by Revenger »

Let's talk about 10679's tests. I think that they are very easy.

First of all, if in Aho-Corasick you compute failure function in dfs order you got accepted. That's very strage, because almost every randomized test check this.

Aho-Corasick have complexity O(m + z), where z is the number of pettern occurences. So, there are many tests where z = O(km). In such test cases Aho-Corasick algo is not better than KMP. But AC algo can be modifed in a such way, that it would have complexity O(m + k) for a current problem.

What's about memory? There is not any good test for checking memory usage. It is very hard to run under 32 MB. If you were not belive me try this test

[pascal]var i, j : longint;

begin
assign(output, '10679.in');
rewrite(output);

randseed := 1017358;

writeln(1);
for i := 1 to 100000 do write(chr(ord('a') + random(26))); writeln;
writeln(1000);
for i := 1 to 1000 do begin
write(chr(ord('a') + ((i div 676) mod 26)));
write(chr(ord('a') + ((i div 26) mod 26)));
write(chr(ord('a') + (i mod 26)));
for j := 4 to 1000 do write(chr(ord('a') + random(26))); writeln;
end;
end.[/pascal]

My program needs about 200 MB to solve this test. What about your program?

windows2k
Experienced poster
Posts: 136
Joined: Sat Apr 05, 2003 3:29 pm
Location: Taiwan

Post by windows2k »

Adrian Kuegel wrote:It seems the problem limits were changed. Now there are up to 1000 queries.
And with KMP you have to go through the text now up to 1000 times, and that is too much.
So it seems that now you have to use Aho-Corasick (with this algorithm it is sufficient to go through the text once for a given set of patterns).
Hello, Adrian Kuegel .
I have accepted the problem. Use the same algorithm with yours.
But it is much slower than your program.
How could you speed your program?
Thx :D

hoangmit
New poster
Posts: 1
Joined: Sun Jul 04, 2004 6:19 am

Post by hoangmit »

I've got troubles with reading input in the online judge. :oops:

Can anybody tell me why ?

[cpp]#include <iostream>
#include <cstdio>
#include <cassert>
#include <cctype>
#include <cstring>

using namespace std;

char text[1000002];
char pattern[1002];

int main()
{
//freopen("C:\\10679.in", "r", stdin);

int i, j, nTest, nQuery;

scanf("%d\n", &nTest);
for (i = 0; i < nTest; i++)
{
gets(text);
scanf("%d\n", &nQuery);
for (j = 0; j < nQuery; j++)
{
gets(pattern);
if (strlen(pattern) > 0)
assert(!isdigit(pattern[0])); // Must be wrong here :-?
}
}
return 0;
}
[/cpp]

pminkov
New poster
Posts: 8
Joined: Wed Oct 17, 2001 2:00 am
Location: Bulgaria, Sofia

Post by pminkov »

That's because someone introduced characters different than
alphabetic chars and digits (which are not allowed).
I supposed that this was messed up when resubmitting my
problem, but I was too tired to check what was wrong this
time. At least someone can fix the problem statement ... :(
Generally it's very frustrating to accept during a contest,
and then not being able to accept the problem in the online
judge :(, and this adds another degree of frustration :(

Adrian Kuegel
Guru
Posts: 724
Joined: Wed Dec 19, 2001 2:00 am
Location: Germany

Post by Adrian Kuegel »

Aho-Corasick have complexity O(m + z), where z is the number of pettern occurences. So, there are many tests where z = O(km). In such test cases Aho-Corasick algo is not better than KMP. But AC algo can be modifed in a such way, that it would have complexity O(m + k) for a current problem.
Yes, and I modified the algorithm for this problem so it is O(m+k).
What's about memory? There is not any good test for checking memory usage. It is very hard to run under 32 MB. If you were not belive me try this test
Of course I believe you, and my program also needs a lot of memory (more than 32 MB). Theoretically I need up to q*1000*55*4 bytes of memory.
How could you speed your program?
Because only lowercase and uppercase letters are allowed, you need only 52 pointers to childs in your tree. This is the only speedup I used.
That's because someone introduced characters different than
alphabetic chars and digits (which are not allowed).
Yes, I also found out that there exists other characters in the input.
My program handles all invalid input characters as if they are 'a' (of course not intentionally, only because it assumes that only valid characters occur).
As the problemsetter probably reads this thread, I hope the judge input will be fixed soon.

CrazyTerabyte
New poster
Posts: 25
Joined: Fri Jul 16, 2004 3:19 am
Location: Brazil
Contact:

Time limit in 10679?

Post by CrazyTerabyte »

Hey, the problem says that time limit is 4 seconds, but I see at ranklist some people with more than 4 seconds (about 13 people, with, at most, 8.6 seconds).

Can someone explain why this?

http://acm.uva.es/cgi-bin/OnlineJudge?P ... :10679:200

Andrey Mokhov
Experienced poster
Posts: 128
Joined: Fri Nov 15, 2002 7:45 am
Location: Kyrgyzstan

Post by Andrey Mokhov »

Hi!

Time limits in problems means the time limits that were in law during contest. Strangly enough after contest all the problems get default
time limit that is 10 seconds. Quite bad, I guess. I won't get AC on this problem if TL were 4 seconds and would search for a better algo :roll: :roll:

Maybe in this particular problem the changing of time limit from 4 to 10 seconds can be explained as the data set and problem statement has changed after contest. But they at least should change the title stating TL... but you know someone has to do it... and there are not lots of people who is doing such things - they can't do everything in a moment.

Buy.
Andrey.

Post Reply

Return to “Volume 106 (10600-10699)”