Exact matching subimages

Let's talk about algorithms!

Moderator: Board moderators

Post Reply
alex[LSD]
Learning poster
Posts: 53
Joined: Sat Jan 24, 2004 9:22 pm
Location: Tomsk, Siberia, RUSSIA
Contact:

Exact matching subimages

Post by alex[LSD] »

Hi!
I have a practical task and I need some pointers on that. Imagine we have two raster images (let it be .BMP). Let's say they are big, about 1000x1000x32bit. I need to find their largest common subimage. And I am not interested in subimages smaller than, let's say 50x50.
I decided that computing checksums for small square areas could speed up the search. But still I am not sure that it'll work fast enough. All the info I found through google was about "best matching" techniques. But I'm interested in EXACT matchings.
Can you people help me?
The more contests are held, the happier I am.
alex[LSD]
Learning poster
Posts: 53
Joined: Sat Jan 24, 2004 9:22 pm
Location: Tomsk, Siberia, RUSSIA
Contact:

Post by alex[LSD] »

Probleb: Help the dumb Russian
Time Limit: ASAP

We have two matrices A and B with dimensions (W1,H1) and (W2,H2) respectively. You need to find six numbers x1,y1,x2,y2,w,h such that
A(i,j) = B(k,l) for each i=x1..x1+w-1
for each j=y1..y1+h-1
for each k=x2..x2+w-1
for each l=y2..y2+h-1,
such that the value w*h is maximized. To be short the task is to find the biggest common submatrix.

Limitations: numbers in matrices line in the range 0..2^24-1. W1,W2<=1024, H1,H2<=768.

P.S. let me know if it's still unclear 8-)
The more contests are held, the happier I am.
nnahas
New poster
Posts: 40
Joined: Mon Nov 01, 2004 7:56 pm

Post by nnahas »

OK the best I got is O(N^4). Here is how I do it:
I try all possible values for (x1-x2) and (y1-y2).
To check what is the maximum common submatrix with given values of
(x1-x2) and (y1-y2) , I shift the matrix M2 by (x1-x2) horizontally and (y1-y2) vertically , and I obtain a new matrix M2' . (shifting is done without wrap-up ,of course. the matrix just gets smaller). Then I obtain the matrix M1' from M1 which just consists of the entries of M1 that will betested for equality with M2'. I then get M3 = M1'-M2' . Now our problem is just to get the largest submatrix of 0's in M3 . This can be done in O(N^2) in the following way: you try all possible y coordinates for the bottom of the rectangle, and for each such coordinate you get the largest rectangle. this can be done in O(N) by using the algorithm described in this link (Problem H)
http://www.informatik.uni-ulm.de/acm/Lo ... judge.html


N^4 is too slow for your problem size . you'll have to scale your bitmap down by averaging blocks of pixels for it to work, unless you find a faster algorithm like N^2 or something.
Please tell me if you find a faster algo or find an error in this one.
nnahas
New poster
Posts: 40
Joined: Mon Nov 01, 2004 7:56 pm

Post by nnahas »

By N I meant the size of the N*N bitmap, not the number of input bits. if you consider the number of input bits as being the input size my algo is O(N^2) , not O(N^4).
alex[LSD]
Learning poster
Posts: 53
Joined: Sat Jan 24, 2004 9:22 pm
Location: Tomsk, Siberia, RUSSIA
Contact:

Post by alex[LSD] »

Thank you so much! I didn't find no errors in your algo. Besides, task H is qiute interesting by itself. Perhaps I ll have to look for good techniques of guessing values for (x2-x1) and (y2-y1)... I realize that I can't solve this problem strictly, but I could dig for a suboptimal solution...
Again, thanks! :D
The more contests are held, the happier I am.
nnahas
New poster
Posts: 40
Joined: Mon Nov 01, 2004 7:56 pm

Post by nnahas »

I think I got an O(N^3 lg(N) ) expected time algorithm . Here it is :

First note that we can check whether there is a common submatrix of size W*H in O(N^2) time, (W and H are given) ,by getting a hash value for each W*H submatrix in the first matrix , putting these hash values in a hash table and then checking for each W*H submatrix in the second matrix if there is an equal W*H submatrix in the first matrix. Computing a hash value a for W*H submatrix can be done in O(1) amortized time by using a 2 dimensional version of the method used in the Rabin-Karp string matching algorithm . so you can tell whether there is a W*H common submatrix in O(N^2).

Now we can try all possible values for W , and for each W , make a binary search on the maximum H. that would result in a O(N^3 lg(N)) algorithm.

That will still be too slow for N=1000. One way to have a faster algo is to combine this algo with your idea of guessing good values for(x1-x2) and (y1-y2) . If you know that the maximum common submatrix must have W>50 and H >50 , you can get all the equal 50 * 50 submatrix pairs , get the (x1-x2) value and (y1-y2) value for each pair , and then get the maximum submatrix for those values.
The idea of the algorithm is that we hope there will be very few equal 50*50 submatrices, for example if we assume a black and white image with the color of each pixel being independent ,and white prob = 0.5, the probability that a pair of 50*50 submatrices will be identical by chance is 2^(-2500) .But of course in reality the pixels are far from being independent, and large areas with the same color may cause very bad performance, but then you'll just have to try sizes larger than 50*50 , until you try a size that allows you to weed out all but a few (x1-x2),(y1-y2) pairs.
In case the pixels are distributed independently from each other within each matrix, (as I said this is a wildly unrealistic assumption) , you get a O(N^2) expected time algorithm , which is quite feasible for N=1000.
Post Reply

Return to “Algorithms”