## 11019 - Matrix Matcher

All about problems in Volume 110. If there is a thread about your problem, please use it. If not, create one with its number in the subject.

Moderator: Board moderators

Cho
A great helper
Posts: 274
Joined: Wed Oct 20, 2004 11:51 pm
Location: Hong Kong
andresw1 wrote:You have very nice idea but I have some doubts that this will give TLE. After you find the same hash as the patern you will need to do the slow compare to see if the matching is valid. For the test case that krijger mentioned (a lot of 'A's) this will give TLE. Do you do something else which I missed? Do you have AC?
You might misunderstand my idea. My run time is independent to the string content.
david wrote:
Cho wrote:Let c[j] be the character of the shorter string, I define its hash value as:
Summation over all valid i,j: c[j] * (3 ^ (i*100+j)) mod 2^32
(x^y denotes x to the power y instead of bitwise-XOR.)

Are you sure you didn't mean, for example,
c[j] * (32 ^ (i*100+j)) mod 2^32 ?
The hash function you wrote will fail to return different values for such simple strings as "ab" and "da" (assuming a is coded as 0), so I can't believe you got AC doing that without later checking if the match is exact (in which case you should get TLE for cases where both the matrix and the pattern are full of a's, for instance).

Yes, you are right. I should use 27 instead of 3. Forfunately, there are only 15 test cases. And my bug can only be caught if you decide one for it specifically.

david wrote:I tried to use as a hash funtion the summation over all valid i, j of
c[j] * (32 ^(i * pattern width + j)) mod 1000000007,
but I got WA after 9.7s, so even if this approach can be made to work I wonder how you managed to get it accepted in about 2.5 seconds.

I just tried your method and got AC in 7 secs. Make sure your arithmetic part is computed correctly.

fmounir
New poster
Posts: 2
Joined: Fri Mar 24, 2006 11:05 pm
Hello Cho,

How can you compute the hash values of all X*Y patterns in the large text in a constant time? Or at least in an O(n*m)?

Thanks.

david
Learning poster
Posts: 83
Joined: Mon Apr 21, 2003 10:14 pm
Cho wrote: I just tried your method and got AC in 7 secs. Make sure your arithmetic part is computed correctly.
You're right, that's enough to get AC. I mistakenly took X and Y to be the pattern width and height, respectively (this choice of letters in the problem statement may be a bit misleading).
Anyway, I still don't know how to make it run below the 5 seconds time limit set during the contest. Was hashing the intended solution to this problem?
fmounir wrote: How can you compute the hash values of all X*Y patterns in the large text in a constant time? Or at least in an O(n*m)?
This can be computed in O(n*m) by first storing the hash values of all X*1 patterns, and then computing the hash of an X*Y pattern based on the hash of the X*Y pattern just above it, along with the hash of the first line.

andresw1
New poster
Posts: 27
Joined: Sat Mar 18, 2006 5:04 pm
Hello,

So the solution is based on hashing the patern? What if we have the same hash for a different set of letters ot this is impossible?

david
Learning poster
Posts: 83
Joined: Mon Apr 21, 2003 10:14 pm
andresw1 wrote:Hello,
So the solution is based on hashing the patern? What if we have the same hash for a different set of letters ot this is impossible?
It is perfectly possible, but it is not likely that the judge test data will contain a case that will cause such a solution to fail.
The hashing solution would indeed be correct if we were not forced to take a modulus to make the result fit into an integer. Computing the hash value modulo two (or more) different primes would give you much better reliability, although this would most probably run beyond the time limit.

andresw1
New poster
Posts: 27
Joined: Sat Mar 18, 2006 5:04 pm
Very interesting...

So if we don't get AC with one prime, we change it... in the we will hit the right prime number which would give us AC:)))

Emilio
Experienced poster
Posts: 163
Joined: Sun Oct 17, 2004 8:31 pm
Location: Murcia, Spain
David wrote:
This can be computed in O(n*m) by first storing the hash values of all X*1 patterns, and then computing the hash of an X*Y pattern based on the hash of the X*Y pattern just above it, along with the hash of the first line.
If your hash function depends of i and j, how do you compute your X*1 patterns in a constant time? I think for each column you must compute the first whole X*1 pattern and for the next ones in the same column in a constant time. Am I right? And if the answer is yes, how do you compute this?

Thanks, in advance.
Last edited by Emilio on Sat Apr 22, 2006 2:39 pm, edited 1 time in total.

david
Learning poster
Posts: 83
Joined: Mon Apr 21, 2003 10:14 pm
Emilio wrote: If your hash function depends of i and j, how do you compute yout X*1 patterns in a constant time? I think for each column you must compute the first whole X*1 pattern and for the next ones in the same column in a constant time. Am I right? And if the answer is yes, how do you compute this?

Thanks, in advance.
That's right. For each row, I first compute the hash of the first 1

andresw1
New poster
Posts: 27
Joined: Sat Mar 18, 2006 5:04 pm
Hello,

David,
What do you mean by saying this is slow? Do you use double hashing or something more complex that makes it slow? I can imagine how it can bi done faster...

misof
A great helper
Posts: 430
Joined: Wed Jun 09, 2004 1:31 pm
andresw1 wrote:Hello,

David,
What do you mean by saying this is slow? Do you use double hashing or something more complex that makes it slow? I can imagine how it can bi done faster...
The modulo operation is slow. If you use a function resembling the one Cho described earlier, you can get the "modulo 2^32" for free -- you just have to compute everything in unsigned ints.

Aaaah, there's nothing like showing off some good old tricks of the trade

andresw1
New poster
Posts: 27
Joined: Sat Mar 18, 2006 5:04 pm
misof,
As far as I know good hash is the one which uses nice primes... 2^32 is not one of them as far as I know. In addition Cho mentioned that you can use two different hashings and make a pair. This way you minimise the chance of wrong answer. Is it smart to use int and unsigned int as pair of mods?

Cho
A great helper
Posts: 274
Joined: Wed Oct 20, 2004 11:51 pm
Location: Hong Kong
andresw1 wrote:As far as I know good hash is the one which uses nice primes... 2^32 is not one of them as far as I know
The hash value (for one dimensional string) introduced by some text book is:
Summation over all k, c[k]*2^k mod p, where p is a prime.
But mine is:
Summation over all k, c[k]*p^k mod (2^32), where p is an odd number.
Both should work, I chose mine because I think (and indeed) it should be faster.
andresw1 wrote:In addition Cho mentioned that you can use two different hashings and make a pair.
It's david's idea, not mine.

aminallam
New poster
Posts: 39
Joined: Thu Dec 08, 2005 12:51 pm
Location: Suez, Egypt.
[quote]That's right. For each row, I first compute the hash of the first 1

Emilio
Experienced poster
Posts: 163
Joined: Sun Oct 17, 2004 8:31 pm
Location: Murcia, Spain
Hi, I got AC.
I would must have read something about Karp-Rabin before asking anything.
Thanks anyway.

animallan you would must follow the same way than me, if you read something about Karp-Rabin you will understand it without troubles. In this moment I haven't the links, but google ...
Good luck!

aminallam
New poster
Posts: 39
Joined: Thu Dec 08, 2005 12:51 pm
Location: Suez, Egypt.
Thank you very much Mr. Emilio. I followed your advice and got AC.

For Mr. Misof:
I do not know how to use the trick of using unsigned int, and not using %. For example, consider this line:
hm2[j]=((hm2[i-1][j]+MOD-(hm1[i-x][j]*po[x-1])%MOD)*AS+hm1[j])%MOD;
How can I convert it using this trick? I have tried removing %MOD, and +MOD, but I got WA.