240 - Variable Radix Huffman Encoding

All about problems in Volume 2. If there is a thread about your problem, please use it. If not, create one with its number in the subject.

Moderator: Board moderators

Post Reply
soyoja
Experienced poster
Posts: 106
Joined: Sun Feb 17, 2002 2:00 am
Location: Seoul, South Korea
Contact:

240 - Variable Radix Huffman Encoding

Post by soyoja »

How can I implement huffman encoding & solve this problem?

I know about huffman encoding theoretically. But I encounter with some

confusing in real implementation...

If you experienced implemeting huffman encoding or solving this

problem , then please tell me about your know-how.

Thanks.
epsilon0
Experienced poster
Posts: 112
Joined: Tue Nov 12, 2002 11:15 pm
Location: Paris, France.

Post by epsilon0 »

hello,

i just solved this problem. (i got PE).

my implementation has the following structures:

an encoding tree that is stored as an array of fathers. each node in the tree has additional fields that include: letter, weigth, position relative to siblings. these informations will allow building the tree incrementally, and the position field will then allow to reconstruct the code associated with each leaf. (by climbing from leaf to root.)

a heap. read about heaps if you need. it is very helpful in this problem. the heap keeps a structure that allow you to fetch the "smallest" node at any given time. (it is always on top of the heap). functions to insert/delete nodes from heap are fast (log n). of course you just need to put a pointer (or index) in the heap. the actual nodes data doesn't need to be moved.

i hope this can help you.

i can give you my source code in C if you have trouble with it and are interested to see a way of solving this problem.

let me know if you get AC.
We never perform a computation ourselves, we just hitch a ride on the great Computation that is going on already. --Tomasso Toffoli
soyoja
Experienced poster
Posts: 106
Joined: Sun Feb 17, 2002 2:00 am
Location: Seoul, South Korea
Contact:

Post by soyoja »

Hello,

Thank you for your kindly answer.

If you can, please send to me your accepted source code.

My e - mail address is soyoja@nownuri.net.

I believe that it must very helpful to me.

Thank you & God bleesing you ~ : )
GreenPenInc
Learning poster
Posts: 53
Joined: Sat May 01, 2004 9:31 pm
Contact:

240 - Huffman - Segfaulting - asserts up the wazoo!

Post by GreenPenInc »

Hi all,

I'm having a bear of a time with problem 240. I'm pretty sure it works okay, but I keep getting segfaults whenever I submit. Here is my (ugly, messy) code -- anyone have any insights as to what could be causing me grief? On my computer (gcc 3.2.3) it works great. Here is my testing input file:

Code: Select all

2 5 5 10 20 25 40
2 5 4 2 2 1 1
3 7 20 5 8 5 12 6 9
4 6 10 23 18 25 9 12
10 26 4 28 28 17 18 18 3 81 785 28 198 39 19 284 200 29 3 3 2 29 478 209 28 17 28 394
2 26 4 28 28 17 18 18 3 81 785 28 198 39 19 284 200 29 3 3 2 29 478 209 28 17 28 394
3 2 21 12
3 4 5 7 8 15
0
and here is my source:

[cpp]
#include <iostream>
#include <string>
#include <sstream>
#include <stdio.h>
#include <assert.h>

#define ALPHAMAX 50
#define KIDMAX 10

using namespace std;

typedef struct lettergroup;

/* GLOBALS */
int r, n, t;
lettergroup *alpha[ALPHAMAX];
string spacer = "Set";
int set = 0;

typedef struct lettergroup
{
int freq;
bool flagged;
string huffcode;
lettergroup *c[KIDMAX], *papa;
/* Common init stuff (um, could I please chain constructors?! */
void init_stuff(void)
{
for (int j = 0; j < r; j++)
c[j] = 0;
papa = 0;
flagged = false;
huffcode = "";
}
/* Constructor for leaf nodes; i = frequency */
lettergroup(int i)
{
freq = i;
init_stuff();
}
/* Constructor for intermediate nodes */
lettergroup(void)
{
init_stuff();
freq = 0;
}
void flag(void)
{
if (c[0])
{
for (int i = 0; i < r; i++)
{
assert(c);
c->flag();
}
return;
}
flagged = true;
}
/* Finds the highest-up ancestor group of this group */
lettergroup *first_parent()
{
if (papa)
return papa->first_parent();
return this;
}
~lettergroup(void)
{
int i;
for (i = 0; i < KIDMAX; i++)
{
if (c)
delete c;
c = 0;
}
}
};

/* Reset the alphabet for a new case */
void reset(void)
{
if (alpha[0])
{
lettergroup *big_top = alpha[0]->first_parent();
delete big_top;
}
int i;
for (i = 0; i < ALPHAMAX; i++)
alpha = 0;
}

/* Huffman-encodes a subtree */
void populate(lettergroup *a, string code)
{
assert(a);
if (!a->c[0])
a->huffcode = code;
else
{
for (int i = 0; i < r; i++)
{
assert(a->c);
populate(a->c, code + (char)(i + '0'));
}
}
}

/* Does the Huffman Encoding for each letter */
void huffman(void)
{
lettergroup *top = alpha[0]->first_parent();
assert(top);
populate(top, "");
}

/* Groups the letters into a tree structure */
void treegen(void)
{
int pass_count = (t - r) / (r - 1) + 1;
assert((t - r) % (r - 1) == 0);
while (pass_count--)
{
assert(alpha[0]);
lettergroup *dad = alpha[0]->first_parent(), *newgroup;
int i, j, best = 0, lowest_score = dad->freq;
assert(lowest_score);
/* new pass, nothing has been used yet */
for (i = 0; i < t; i++)
{
assert(alpha);
alpha->flagged = false;
}
/* first pass, find the lowest */
for (i = 1; i < t; i++)
{
dad = alpha[i]->first_parent();
if (dad->freq < lowest_score)
{
lowest_score = dad->freq;
best = i;
}
}
/* make the new group */
dad = alpha[best]->first_parent();
assert(dad);
dad->flag(); // flag the whole group as used
newgroup = dad->papa = new lettergroup();
newgroup->c[0] = dad;
newgroup->freq = dad->freq;
/* now do all remaining passes and add them in-order to the group */
for (i = 1; i < r; i++)
{
for (j = 0; j < t; j++)
{
assert(alpha[j]);
if (!alpha[j]->flagged)
{
best = j;
dad = alpha[j]->first_parent();
assert(dad);
lowest_score = dad->freq;
break;
}
}
while (j < t)
{
assert(alpha[j]);
if (!alpha[j]->flagged)
{
dad = alpha[j]->first_parent();
assert(dad);
if (dad->freq < lowest_score)
{
lowest_score = dad->freq;
best = j;
}
}
j++;
}
dad = alpha[best]->first_parent();
assert(dad);
dad->flag();
dad->papa = newgroup;
newgroup->c[i] = dad;
newgroup->freq += dad->freq;
} //for
} //while
}

/* Average length of set (round to 2 dec. places) */
double avelen(void)
{
double dReturn = 0.0;
int tot_freq = 0;
for (int i = 0; i < n; i++)
{
assert(alpha[i]);
tot_freq += alpha[i]->freq;
dReturn += double(alpha[i]->freq * (alpha[i]->huffcode).length());
}
assert(tot_freq > 0);
return dReturn / tot_freq;
}

/* Output the results */
void output(void)
{
int i;
printf("%s %d; average length %4.2f\n", spacer.data(), set, avelen());
for (i = 0; i < n; i++)
cout << " " << char(i + 'A') << ": " << alpha[i]->huffcode << endl;
}

int main(int argc, char **argv)
{
cin >> r;
string line;
while (r > 0)
{
set++;
cin >> n;
t = n; // t is total number of symbols, including fictitious ones
assert(r > 1);
if ((n - 1) % (r - 1)) // if it won't work out evenly, add fake ones
t += r - 1 - ((n - r) % (r - 1));
assert(r <= 10 && r >= 2 && n >= 2 && n <= 26 && t >= n && t <= ALPHAMAX);
reset();
getline(cin, line);
istringstream iss(line);
int i = 0, q;
while (iss >> q)
{
assert(q <= 999 && q >= 1);
alpha[i++] = new lettergroup(q);
}
if (n < t)
while (i <= t)
alpha[i++] = new lettergroup(0);
treegen();
huffman();
output();
cin >> r;
spacer = "\nSet";
}
return 0;
}


[/cpp]
_-(GPI)-_

"Finally I have freed myself from the clutches of the garbage fairy!"
alexiago
New poster
Posts: 14
Joined: Thu Jan 24, 2008 6:34 pm

240 Variable Radix Huffman Encoding

Post by alexiago »

the priority queue in this problem has a little trick, if there's a tie between a letter node and a "combination letter" (this node is the sum between two other nodes) then the "combination letter" node wins.
if there's a tie between two combination letters then the first added node wins.
try the I/O below

INPUT:
2 23 1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67
2 14 1 667 334 1 667 334 1 667 334 1 667 334 1 667
0

my accepted code OUTPUT:

Set 1; average length 4.27
A: 01001100
B: 01001101
C: 0100111
D: 010010
E: 101110
F: 101111
G: 01000
H: 01110
I: 01111
J: 10110
K: 11110
L: 11111
M: 0010
N: 0011
O: 0101
P: 0110
Q: 1000
R: 1001
S: 1010
T: 1100
U: 1101
V: 1110
W: 000

Set 2; average length 3.22
A: 0010110
B: 010
C: 0011
D: 0010111
E: 011
F: 1110
G: 001000
H: 100
I: 1111
J: 001001
K: 101
L: 000
M: 001010
N: 110
the_color_gray
New poster
Posts: 1
Joined: Sat Apr 20, 2002 11:57 pm

Re: 240 Variable Radix Huffman Encoding

Post by the_color_gray »

For the input:

Code: Select all

10 26 100 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
My WA attempt produces the output:

Code: Select all

Set 1; average length 1.14
    A: 9
    B: 72
    C: 73
    D: 74
    E: 75
    F: 76
    G: 77
    H: 78
    I: 79
    J: 80
    K: 81
    L: 82
    M: 83
    N: 84
    O: 85
    P: 86
    Q: 87
    R: 88
    S: 89
    T: 0
    U: 1
    V: 2
    W: 3
    X: 4
    Y: 5
    Z: 6
Please indicate if this is incorrect.
The most likely way for the world to be destroyed, most experts agree, is by accident. That's where we come in; we're computer professionals. We cause accidents. - Nathaniel Borenstein
dull_jester
New poster
Posts: 17
Joined: Fri Oct 21, 2016 12:58 pm
Location: NS, Canada

Re: 240 - Variable Radix Huffman Encoding

Post by dull_jester »

Yes, alexiago is right! Just got Accepted. Although this rule is not written very well in the problem description. Such a pity!
Post Reply

Return to “Volume 2 (200-299)”