103 - bug in judge

Posted: Sat Dec 02, 2006 7:17 am
by kogorman
I know this is from the original problem set, but it appears
that there's a bug in the judge unfixed in all this time.
I suggest you take this judge offline until you have a chance
to fix it.

REASON: it is very discouraging to have a good program
rejected in one of the first problems you try. :(

I could not figure out why my solution was reported as WA,
and found a Forum thread where someone else has the same
problem. Check out
http://online-judge.uva.es/board/viewtopic.php?t=10932

There's a program given there that is scored Accepted, but
which has a bug. It gives the wrong answer on the input
5 2
41 595
291 836
350 602
483 548
537 624

But this program was graded Accepted. My own (and some
other people's) programs give the correct result, but are scored
Wrong Answer by the judge. I submitted the broken program
just to see if this was the case. I have edited it slightly to
keep my g++ compiler happy, and to remove a presentation
problem, but it is still broken and still got Accepted.

My own program failed in test ID 5174216
The broken program was accepted in ID 5174527

So I'm graded as having completed problem 103, but this is
wrong because:
-- that wasn't my code. I don't want credit for it.
-- the code is actually broken.

I can't be sure that my program would pass a corrected judge,
of course, because I don't know what is being judged as wrong.

Posted: Sat Dec 02, 2006 2:21 pm
by Carlos
Would you please try:

30 10
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
200 201 202 203 204 205 206 207 208 209
100 101 102 103 104 105 106 107 108 109
300 301 302 303 304 305 306 307 308 309
200 201 202 203 204 205 206 207 208 209
100 101 102 103 104 105 106 107 108 109
300 301 302 303 304 305 306 307 308 309
400 401 402 403 404 405 406 407 408 409
500 501 502 503 504 505 506 507 508 509
411 412 413 414 415 416 417 418 419 420
521 522 523 524 525 526 527 528 529 530
50 60 70 80 90 50 60 70 80 90
20 30 40 50 60 70 80 90 10 99
10 9 8 7 6 5 4 3 2 1
19 29 39 49 59 69 79 89 95 9
15 35 25 45 65 55 85 75 93 5
50 60 70 80 90 50 60 70 80 90
20 30 40 50 60 70 80 90 10 99
10 9 8 7 6 5 4 3 2 1
19 29 39 49 59 69 79 89 95 9
15 35 25 45 65 55 85 75 93 5

Your should-be-AC program's output is:
10
25 24 22 12 11 13 17 19 18 20

and judge's output is:
13
1 2 3 4 5 21 12 11 13 17 19 18 20

Please, as soon as you check it, would you post your answer? Thanks.

After we check the judge's validity (and the correctness of your should-be-AC submission), we'll add some more test cases to the problem, so that no wrong submission gets AC by chance.

Posted: Sat Dec 02, 2006 6:36 pm
by Adrian Kuegel
The judge output for this test case seems to be correct. At least, it is obviously a valid stack of towers, and it is bigger than the output of the submitted program (which means this program is definitely wrong).
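
For anyone who wants to repeat this kind of check, here is a minimal sketch of a validity test. It assumes the usual rule for problem 103: one box nests in another if, after sorting the dimensions of both, every dimension of the first is strictly smaller than the matching dimension of the second. The names nests and is_valid_nesting are made up for this example and have nothing to do with the judge's real checker. The sample data is the 5 2 input from the first post.

```python
def nests(inner, outer):
    """True if box `inner` fits in box `outer` under some reordering of dimensions."""
    # Comparing the sorted dimensions position by position covers every reordering.
    return all(a < b for a, b in zip(sorted(inner), sorted(outer)))


def is_valid_nesting(boxes, order):
    """Check that `order` (1-based box numbers) lists boxes that nest in sequence."""
    # Reject box numbers that do not exist in the input.
    if any(not 1 <= i <= len(boxes) for i in order):
        return False
    # Each box must fit strictly inside the next one in the sequence.
    return all(nests(boxes[a - 1], boxes[b - 1]) for a, b in zip(order, order[1:]))


# Input from the first post (5 boxes, 2 dimensions).
boxes = [(41, 595), (291, 836), (350, 602), (483, 548), (537, 624)]
print(is_valid_nesting(boxes, [1, 3, 5]))  # True
print(is_valid_nesting(boxes, [1, 2, 5]))  # False: 836 is not less than 624
```

A check like this only says a sequence is valid. Whether it is the longest possible one has to be compared separately.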

Agreed

Posted: Sat Dec 02, 2006 8:58 pm
by kogorman
I have run my solution against this test case, and get the
same answer you did, and verified that the judge's answer
is longer and is correct. Accordingly, the WA response to
my program was correct.

Moreover, I also ran the incorrect C++ program and found
it gave a correct, but different answer:

13
28 7 8 9 10 26 15 14 16 17 19 18 20

Accordingly, I agree that the best resolution is to add test
cases to the judge.

Thanks for the quick reply.
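
Since several different sequences of the same length can all be correct, one way to cross-check an answer is to compute the best possible length independently. Below is a rough sketch using a common approach (sort each box's dimensions, sort the boxes, then a longest-chain dynamic program). This is an assumption about how one might solve it, not a description of the judge's solution or of either program discussed above, and longest_nesting is a name invented for the example.

```python
def longest_nesting(boxes):
    """Return one longest nesting sequence as 1-based box numbers."""
    if not boxes:
        return []
    dims = [sorted(b) for b in boxes]
    # If box i nests in box j, its sorted dimensions compare lower, so i comes first.
    idx = sorted(range(len(boxes)), key=lambda i: dims[i])
    best = [1] * len(boxes)   # best[j]: longest chain ending at box j
    prev = [-1] * len(boxes)  # prev[j]: previous box in that chain, or -1
    for pos, j in enumerate(idx):
        for i in idx[:pos]:
            fits = all(a < b for a, b in zip(dims[i], dims[j]))
            if fits and best[i] + 1 > best[j]:
                best[j] = best[i] + 1
                prev[j] = i
    # Walk back from the end of a longest chain.
    j = max(range(len(boxes)), key=lambda i: best[i])
    chain = []
    while j != -1:
        chain.append(j + 1)
        j = prev[j]
    return chain[::-1]


# Input from the first post (5 boxes, 2 dimensions).
boxes = [(41, 595), (291, 836), (350, 602), (483, 548), (537, 624)]
answer = longest_nesting(boxes)
print(len(answer))                 # 3
print(" ".join(map(str, answer)))  # 1 3 5
```

When there are ties, this sketch just returns whichever chain it finds first, so its output can differ from another correct program while having the same length.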

Posted: Sun Dec 03, 2006 1:39 pm
by Chinchilla
Hello, I'm also having a WA problem. My program did output

13
1 2 3 4 5 21 12 11 13 17 19 18 20

for that particular test case and I'm confident my solution outputs correct answers.

ID run
5177083

Posted: Sun Dec 17, 2006 4:23 pm
by Carlos
Your solution is not right; it fails for some test cases. Also, you have a presentation error. Please fix both.

I don't want to publish those test cases, so if you want more information please mail me.

About increasing the judge's input: this problem has 20k submissions, and rejudging it will take a lot of machine time. Since we have to rejudge a lot of problems due to PE, and since this is not an urgent matter, we'll delay it (not for long, I hope).

Posted: Tue Jan 09, 2007 4:09 pm
by Carlos
I haven't forgotten about this, but rejudging it would be very heavy for the machine now. I think I'll wait until every PE problem is rejudged.

Btw...up!

Posted: Sun Apr 08, 2007 1:43 pm
by Carlos
up

Re: 103 - bug in judge

Posted: Sun May 18, 2008 8:27 pm
by Carlos
We've added some more datasets to ensure no wrong solution gets AC. We are rejudging every submission.