10679 - I Love Strings!!
Moderator: Board moderators
10679 - I Love Strings!!
I was trying to solve this problem with the naive solution (using strstr).
It gives TLE.
I've created a huge test file and the naive solution solves that huge test file very fast.
So... has the judge a lot of test files ?
I've few motivation to implement a better algorithm when I can't see the naive is slow.
By the way, if strstr is well implemented, a better solution is hard to implement considering there are only 100 queries in a single string, isn't it ?
It gives TLE.
I've created a huge test file and the naive solution solves that huge test file very fast.
So... has the judge a lot of test files ?
I've few motivation to implement a better algorithm when I can't see the naive is slow.
By the way, if strstr is well implemented, a better solution is hard to implement considering there are only 100 queries in a single string, isn't it ?
-
- Learning poster
- Posts: 63
- Joined: Mon Dec 22, 2003 5:05 am
- Location: Russia, Chelyabinsk
- Contact:
Re: 10679
All naive solution have complexity O(n*m) in worst case,
where n - lenght of base string, and m - length of substring.
What for this problem n=100000 and m=1000, O(n*m)=O(100 000 000) multiple it by number of queries and by number of test cases you get
VERY BIG complexity.
Use O(n+m) algo. For example, Knuth-Morris-Pratt.
-best,
Pavel
where n - lenght of base string, and m - length of substring.
What for this problem n=100000 and m=1000, O(n*m)=O(100 000 000) multiple it by number of queries and by number of test cases you get
VERY BIG complexity.
Use O(n+m) algo. For example, Knuth-Morris-Pratt.
-best,
Pavel
-
- Guru
- Posts: 647
- Joined: Wed Jun 26, 2002 10:12 pm
- Location: Hong Kong and New York City
- Contact:
This topic is well-researched. I ended up using Turbo Boyer-Moore. I thought it was slightly unforunate that it became a "guess the algorithm" contest, because my first submission was Ukkonen's algorithm with a suffix tree, but even that got TLE.. (during the contest..)
I'm curious as to what kind of algorithm people used..
I'm curious as to what kind of algorithm people used..
-
- Guru
- Posts: 724
- Joined: Wed Dec 19, 2001 2:00 am
- Location: Germany
-
- Guru
- Posts: 724
- Joined: Wed Dec 19, 2001 2:00 am
- Location: Germany
All patterns are stored in one tree. Then failure links are created (that have same meaning as in KMP). And then for each node, the first failure link that points to a pattern is recorded as "output link". This is necessary because some pattern may be sub-pattern of a bigger pattern and would not be found otherwise.
Example:
2 patterns: string, in
the tree looks like this:
I numbered the nodes in bfs order. This order is necessary to build the failure links.
Failure link of some node is failure link of parent + last read character,
or if this is not a node in the tree, it is the failure link of node pointed to by failure link of parent + last read character ... until either a node in the tree is found or the root is reached.
So in the example, there is a failure link from 6 to 2, from 7 to 4, and from all other nodes to the root.
When we are matching a text, it is sufficient to go through the text once.
For each character in the string, we try to follow an edge in the tree. If there is no such edge, follow a failure link (like in KMP). For each reached node, follow the output link in order not to miss a string to output.
Lets search in the string "strings".
We follow the edges in the tree 0->1, 1->3, 3->5, 5->6, 6->7
now we reach output link to 4 and see that we matched the pattern "in".
then we continue with 7 -> 8. Now we have matched the pattern "string".
and then we have to take failure link, because there is no edge labeled 's'.
I think you will also find some information in the internet if you use google.
Example:
2 patterns: string, in
the tree looks like this:
Code: Select all
______
| |
s(1) i(2)
| |
t(3) n(4)
|
r(5)
|
i(6)
|
n(7)
|
g(8)
Failure link of some node is failure link of parent + last read character,
or if this is not a node in the tree, it is the failure link of node pointed to by failure link of parent + last read character ... until either a node in the tree is found or the root is reached.
So in the example, there is a failure link from 6 to 2, from 7 to 4, and from all other nodes to the root.
When we are matching a text, it is sufficient to go through the text once.
For each character in the string, we try to follow an edge in the tree. If there is no such edge, follow a failure link (like in KMP). For each reached node, follow the output link in order not to miss a string to output.
Lets search in the string "strings".
We follow the edges in the tree 0->1, 1->3, 3->5, 5->6, 6->7
now we reach output link to 4 and see that we matched the pattern "in".
then we continue with 7 -> 8. Now we have matched the pattern "string".
and then we have to take failure link, because there is no edge labeled 's'.
I think you will also find some information in the internet if you use google.
poor data set
I applied Boyer's Moore and got AC in 0.806 seconds..
.. but after reversing and using strstr took only 0.189 s
I am waiting for a rejudge eventhough I am currently ranked 2 in this problem.
![:-?](./images/smilies/icon_confused.gif)
.. but after reversing and using strstr took only 0.189 s
I am waiting for a rejudge eventhough I am currently ranked 2 in this problem.
![:-?](./images/smilies/icon_confused.gif)
Re: poor data set
Well, I don't think that the "reversing" technique is so bad after all. It is not easy to come up with this idea during contest time! One has to be really smart to think of this.sohel wrote:I applied Boyer's Moore and got AC in 0.806 seconds..
.. but after reversing and using strstr took only 0.189 s
I am waiting for a rejudge eventhough I am currently ranked 2 in this problem.
![:D](./images/smilies/icon_biggrin.gif)
Anyway, it would be great if I could learn the advanced algorithms you guys mentioned above. It takes time, of course
![:wink:](./images/smilies/icon_wink.gif)
7th Contest of Newbies
Date: December 31st, 2011 (Saturday)
Time: 12:00 - 16:00 (UTC)
URL: http://uva.onlinejudge.org
Date: December 31st, 2011 (Saturday)
Time: 12:00 - 16:00 (UTC)
URL: http://uva.onlinejudge.org
odd
Something very interesting :
After reversing the strings and using strstr() took .187 seconds...
.. later I thought of a better idea.. ie reverse then apply Boyer Moore.
.. but that gave TLE..
So it seems that for some cases strstr is better than Boyer Moore.
.. can any one tell me the reason why I get TLE with the modified method.
![:-?](./images/smilies/icon_confused.gif)
After reversing the strings and using strstr() took .187 seconds...
.. later I thought of a better idea.. ie reverse then apply Boyer Moore.
.. but that gave TLE..
So it seems that for some cases strstr is better than Boyer Moore.
.. can any one tell me the reason why I get TLE with the modified method.
![:-?](./images/smilies/icon_confused.gif)