283 - Compress

Cloud · Post by **Cloud** » Tue Jan 29, 2002 7:39 pm

give me some tricky data pls , i dont know why my program got WA

Cloud · Post by **Cloud** » Tue Jan 29, 2002 7:50 pm

I 've just got AC

Thanks for reading.

hlchan · Post by **hlchan** » Sat Jul 20, 2002 8:08 am

The test input contains charactors besides 'A'..'Z', 'a'...'z', '.', ' ', ',', '-', '$'.

So bad this question is!

cytse · Post by **cytse** » Tue Oct 22, 2002 6:51 am

Can anyone explain why the solution for sample input is 335 and 167?

Per · Post by **Per** » Fri Nov 29, 2002 11:44 pm

Doing only the second case as it's smaller, 167 is what you get when using, for instance, this encoding scheme:

[space] -> 000
b -> 001
e -> 010
h -> 011
i -> 100
o -> 101
t -> 110
a -> 111000
n -> 111001
q -> 111010
r -> 111011
s -> 111100
T -> 111101
u -> 111110
$ -> 1111110
. -> 11111110
, -> 11111111

And I think you'll find a hard time trying to do better using the encoding method of the problem.

Even though I've finally managed to get the right output for the sample data, I still get WA. Could anyone of the few people who are actually accepted on this one give the corresponding output for the following input:

2
3
$
$
$
4
ABCABC123321123ABC$
ABC
ABC
123$

rjhadley · Post by **rjhadley** » Thu Dec 05, 2002 6:05 am

Would someone mind explaining the method for determining the encoding? The question seems quite vague and ambiguous .e.g.

Suppose you want to give some characters a code length of 3.

Then in next paragraph:

You choose to give the first three characters a code with length 2 and the rest an equal length.

Based upon the statement of the problem, one would think a Huffman Encoding is appropriate, but if I am not mistaken, a Huffman Encoding gives a total length which is shorter than the "minimum" for the sample inputs. So what is the process for determining the "minimum" encoding that it asks for?

Thanks.

Adrian Kuegel · Post by **Adrian Kuegel** » Thu Dec 05, 2002 10:39 am

Per: The output for your input is:
3
86

The method is the following:
You can use all bit lengths between 1 and 8, and you can choose for example to use bit length 2 for 3 characters and bit length (2+4) for up to 16 characters.
I use backtracking with memoization to calculate the minimum.

Per · Post by **Per** » Thu Dec 05, 2002 10:40 pm

Ah, I got accepted. All strange test cases considered, it still came down to a silly array initialization bug.

I did something similar; a simple O(n^2) (which could triviallly be turned into O(n log n)) thing using dynamic programming (where n is the number of distinct characters in the text).

cytse · Post by **cytse** » Tue Dec 31, 2002 5:31 pm

Thanks all. I misunderstood the problem.

cplusplus · Post by **cplusplus** » Fri Feb 07, 2003 10:32 pm

as the question describe
the first line of the test file will be an number N
and there will be N test cases....
the frist line of every case is a number m,then followed by n lines of data
then here is my code, I just read the in the data without doing anything
..........
int N,nLine,length;
char line[255];
cin>>N;
while(N--)
{
cin>>nLine;
while(nLine--)
{
length=0;
while(length==0)
{
cin.getline(line,255);
length=strlen(line);
}
}
}

but it get "time limit exceeded"....about 30 times.....
why?? I have tried many different ways.....but it dosen't work...
Can anyone please help me??
Thank you very much

cytse · Post by **cytse** » Sat Feb 08, 2003 9:13 am

Some input lines are quite long. Your character array is not large enough to get the whole line.

cplusplus · Post by **cplusplus** » Mon Feb 10, 2003 3:35 pm

cytse wrote:Some input lines are quite long. Your character array is not large enough to get the whole line.

ya!!
Thank you very much!!!!!
That's it!!
I assume there are most 1024 chars per line,so.....>"<
I have got AC now,finally

Thank you very much!!

epsilon0 · Post by **epsilon0** » Sat May 31, 2003 2:48 pm

this problem is STUPID

1] the text is very unclear about the encoding scheme. so i assume it is an huffman encoding (= optimal bitlength) with an extra condition, that all codes must have 8 bit maximum. so i assumed it was a "modified huffman".

2] i'm sure the sample output is wrong if my assumption is not.

3] i exhibit a 155 bit encoding scheme for the second sample, demonstrating my point:

To be or not to be, that is the question.$

'T' (1 occur) -> 111000
'o' (5 occur) -> 011
' ' (9 occur) -> 00
'b' (2 occur) -> 1001
'e' (4 occur) -> 1000
'r' (1 occur) -> 111001
'n' (2 occur) -> 1010
't' (6 occur) -> 010
',' (1 occur) -> 111010
'h' (2 occur) -> 1011
'a' (1 occur) -> 111011
'i' (2 occur) -> 1100
's' (2 occur) -> 1101
'q' (1 occur) -> 111100
'u' (1 occur) -> 111101
'.' (1 occur) -> 111110
'$' (1 occur) -> 111111

total is 9 * 2 + 11 * 3 + 14 * 4 + 8 * 6
= 18 + 33 + 56 + 48
= 155 bits.

this is a correct huffman encoding and all bitlengths are between 1 and 8.

PS: please dont formalise over an eventual error in the above table, as i partially computed it by hand. the point is that such an encoding exists that make it 155 bits, using:

1 code of length 2 bit
2 codes of length 3 bit
6 codes of length 4 bit
8 codes of length 6 bit

of course there are lots of ways to then build an encoding scheme with the same bitlength of 155 bit (just allocate the codes differently, swap the codes of same length, etc...)

so WTF is wrong with that problem? HELP

Caesum · Post by **Caesum** » Wed Nov 12, 2003 9:02 pm

epsilon,

it is not huffmann. you can code so many characters, for example using 4 bits=16 characters or you can use *one* of those characters to define an extension=15 characters plus one extension code.

junbin · Post by **junbin** » Mon Jan 05, 2004 5:53 am

Per wrote: Even though I've finally managed to get the right output for the sample data, I still get WA. Could anyone of the few people who are actually accepted on this one give the corresponding output for the following input:

2
3
$
$
$
4
ABCABC123321123ABC$
ABC
ABC
123$

I don't think there's such output...
1) All lines must end with a $
2) numbers (1, 2, 3) are not valid characters.

Therefore, any answers you get for this is probably undefined... (my AC program gives a different answer than the one given below)

OJ Board

283 - Compress

Poor question!

283 - Compress

The input of 283,please help me,thank you