今天是学习java的第29天,今天学习的是Huffman的建树。
2. Description首先,我先对我理解的赫夫曼树做一个介绍吧。
赫夫曼树,又称最优树,是带权路径最短的树。
树的带权路径是树中所有叶子节点的带权路径长度之和,通常记为WPL。
而赫夫曼树,则是由这些节点构造的树中带权路径最短的一条。
具体的,可以看看这位博主的介绍:http://c.biancheng.net/view/3398.html
我就直接说赫夫曼树怎么创建吧
2.1 Huffman Tree 创建有以上ABCD四个节点,其权值分别为1,2,3,4.
选择其中权值最小的两个,这里就是A和B,将其组合起来
这样,就得到了一个权值为上面两个之和的节点,再将这个整体返还到原节点库中,重复操作
最后就得到下面的这个树
此时WPL = 4 * 1 + 3 * 2 + 2 * 3 + 1 * 3 = 19,带权路径最小
这个树也就是赫夫曼树
3. Codepackage datastructure;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Collectors;
import java.util.Arrays;
public class Huffman {
class HuffmanNode {
// The char.
char Character;
// Weight.
int Weight;
// The left child.
HuffmanNode LeftChild;
// The right child.
HuffmanNode RightChild;
// The parent.
HuffmanNode Parent;
public HuffmanNode(char paraCharacter, int paraWeight, HuffmanNode paraLeftChild, HuffmanNode paraRightChild, HuffmanNode paraParent) {
Character = paraCharacter;
Weight = paraWeight;
LeftChild = paraLeftChild;
RightChild = paraRightChild;
Parent = paraParent;
} // Of HuffmanNode
public String toString() {
String resultString = "(" + Character + ", " + Weight + ")";
return resultString;
} // Of toString
} // Of HuffmanNode
public static final int NUM_CHARS = 256;
String inputText;
int alphabetLength;
char[] alphabet;
// 字符数。长度包括非叶子节点,长度为2*alphabetLength.
int[] charCounts;
// 字符到字母表中的映射。
int[] charMapping;
// 字母表中每个字符的代码。
String[] huffmanCodes;
HuffmanNode[] nodes;
public Huffman(String paraFilename) {
charMapping = new int[NUM_CHARS];
readText(paraFilename);
}// Of the first constructor
public void readText(String paraFilename) {
try {
inputText = Files.newBufferedReader(Paths.get(paraFilename), StandardCharsets.UTF_8)
.lines().collect(Collectors.joining("n"));
} catch (Exception ee) {
System.out.println(ee);
System.exit(0);
} // Of try
System.out.println("The text is:rn" + inputText);
}// Of readText
public void constructAlphabet() {
// 所有字符到字符表中的映射被初始化为-1.
Arrays.fill(charMapping, -1);
// 字符数目统计,最多为NUM_CHARS个.
int[] tempCharCounts = new int[NUM_CHARS];
// 由下文可知,求出Char的ASCII索引.
int tempCharIndex;
// Step 1.
char tempChar;
for (int i = 0; i < inputText.length(); i++) {
tempChar = inputText.charAt(i);
tempCharIndex = (int) tempChar;
System.out.print("" + tempCharIndex + " ");
tempCharCounts[tempCharIndex]++; // 这里类似于哈希表的形式,统计索引为tempCharIndex的字符的出现次数。
} // Of for i
// Step 2. 统计出现了多少种字符.
alphabetLength = 0;
for (int i = 0; i < 255; i++) {
if (tempCharCounts[i] > 0) {
alphabetLength++;
} // Of if
} // Of for i
// Step 3.
alphabet = new char[alphabetLength];
charCounts = new int[2 * alphabetLength - 1];
int tempCounter = 0;
for (int i = 0; i < NUM_CHARS; i++) {
if (tempCharCounts[i] > 0) {
alphabet[tempCounter] = (char) i;
charCounts[tempCounter] = tempCharCounts[i];
charMapping[i] = tempCounter;
tempCounter++;
} // Of if
} // Of for i
System.out.println("The alphabet is: " + Arrays.toString(alphabet));
System.out.println("Their counts are: " + Arrays.toString(charCounts));
System.out.println("The char mappings are: " + Arrays.toString(charMapping));
}// Of constructAlphabet
public void constructTree() {
// Step 1. 为节点分配空间
nodes = new HuffmanNode[alphabetLength * 2 - 1];
boolean[] tempProcessed = new boolean[alphabetLength * 2 - 1];
// Step 2. 初始化叶子节点.
for (int i = 0; i < alphabetLength; i++) {
nodes[i] = new HuffmanNode(alphabet[i], charCounts[i], null, null, null);
} // Of for i
// Step 3. 创建树.
int tempLeft, tempRight, tempMinimal;
for (int i = alphabetLength; i < 2 * alphabetLength - 1; i++) {
// Step 3.1 选择第一个最小值作为左孩子.
tempLeft = -1;
tempMinimal = Integer.MAX_VALUE;
for (int j = 0; j < i; j++) {
if (tempProcessed[j]) {
continue;
} // Of if
if (tempMinimal > charCounts[j]) {
tempMinimal = charCounts[j];
tempLeft = j;
} // Of if
} // Of for j
tempProcessed[tempLeft] = true;
// Step 3.2 选择第二个小的值作为右孩子.
tempRight = -1;
tempMinimal = Integer.MAX_VALUE;
for (int j = 0; j < i; j++) {
if (tempProcessed[j]) {
continue;
} // Of if
if (tempMinimal > charCounts[j]) {
tempMinimal = charCounts[j];
tempRight = j;
} // Of if
} // Of for j
tempProcessed[tempRight] = true;
System.out.println("Selecting " + tempLeft + " and " + tempRight);
// Step 3.3 创建一个新节点
charCounts[i] = charCounts[tempLeft] + charCounts[tempRight];
nodes[i] = new HuffmanNode('*', charCounts[i], nodes[tempLeft], nodes[tempRight], null);
// Step 3.4 链接孩子节点.
nodes[tempLeft].Parent = nodes[i];
nodes[tempRight].Parent = nodes[i];
System.out.println("The children of " + i + " are " + tempLeft + " and " + tempRight);
} // Of for i
}// Of constructTree
public HuffmanNode getRoot() {
return nodes[nodes.length - 1];
}// Of getRoot
public static void main(String[] args) {
Huffman tempHuffman = new Huffman("D:/java/vstest/src/datastructure/Huffman.txt");
tempHuffman.constructAlphabet();
tempHuffman.constructTree();
HuffmanNode tempRoot = tempHuffman.getRoot();
System.out.println("The root is: " + tempRoot);
} // Of main
}
运行结果
4. Summarize晚上时间有限,我虽然成功的将代码复现了出来,但是其中细枝末节仍然感觉一知半解,就感觉没有把整个代码串起来,还得多琢磨琢磨。



