Select Page

Complete all the problems in the attached PDF. You may do this on paper or in a Word or other document
hw_14_text_compression_1.pdf
Unformatted Attachment Preview
Individual Homework: Text Compression with Huffman Codes
In class we discussed two different forms of text compression: Lempel-Ziv and Huffman codes.
Try out Huffman codes yourself to make sure you’ve got the concept.
Text compression
Consider the following text (69 bytes):
HOW MUCH WOOD WOULD A WOODCHUCK CHUCK IF A WOODCHUCK
COULD CHUCK WOOD
1.
Compress this using Huffman encoding. I will give you the Huffman tree (below). You
provide each of the following:
a. Dictionary (a table of letters and codes)
b. Dictionary size (in bytes: 8 bits per letter, plus the number of bits in the code, all
added up and then divided by 8 to get bytes)
c. Encoded text (a sequence of bits)
d. Encoded text size (in bytes: count the bits and divide by 8)
e. Total compressed size (dictionary + encoded text)
f. Space savings (1 – compressed/original)
69
0
0
SPACE
12
00
1
U
26
43
1
0
14
20
0
7
C
1
0
I
1
010010
10
1
0
A
K
1
1
010011
4
1000
1
2
01010
L
0
O
10
0
4
F
0
1
3
2
01000
23
101
0
M
1
1
7
011
0
1
2
01011
1
11
12
110
1
D
0
6
1001
H
6
1110
1
W
6
1111
Huffman tree
Now consider the following text:
HALLOWEEN
2. Build a frequency table for this text.
3. Using the frequency table you just built, build a Huffman tree. Remember you always
combine the two lowest-frequency nodes currently available.
Next, I want you to consider the frequency table to the right. I generated it from a
real piece of text (The Wonderful Wizard of Oz by L. Frank Baum), so it more or less
reflects actual English usage of letters and spaces. (Certain letters, like Y and O, are
abnormally elevated because they appear in “Dorothy” over and over in the text.)
4. Build a Huffman tree using these frequencies. You can work with a partner
on this one if you want. Remember you always combine the two lowestfrequency nodes currently available.
Huffman codes
5. Using the Huffman tree you built in #4 above, find the Huffman code for
each letter of the alphabet (and space).
6. Now using your codes, encode the following text:
THE KANGAROO CAN JUMP INCREDIBLE
Q
X
J
Z
V
K
P
B
G
F
M
C
Y
U
W
L
D
S
I
R
N
H
A
O
T
E
space
1
1
1
1
6
10
12
12
18
18
19
21
21
23
26
36
44
48
51
52
54
60
69
71
82
111
200