📜  Huffman Decoding

📅  Last modified: 2021-10-26 05:54:30             🧑  Author: Mango



Frequencies : A: 6, B: 1, C: 6, D: 2, E: 5
Encoded Data : 
Huffman Tree: '#' is the special character used
              for internal nodes, since the character field
              is not needed for internal nodes.
               #(20)
             /       \
        #(12)         #(8)
     /      \        /     \
    A(6)     C(6) E(5)     #(3)
                         /     \
                       B(1)    D(2)  
Code of 'A' is '00', code of 'C' is '01', ..

Input Data : geeksforgeeks
Characters with their Huffman codes
e 10, f 1100, g 011, k 00, o 010, r 1101, s 111
Encoded Huffman data :
01110100011111000101101011101000111
Decoded Huffman Data
geeksforgeeks


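  1. Start at the root and repeat the following steps until a leaf is reached.
  2. If the current bit is 0, move to the left child.
  3. If the current bit is 1, move to the right child.
  4. When a leaf node is reached, print that leaf's character, then return to step 1 and continue with the remaining bits of the encoded data.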

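The code below takes a string as input, encodes it, and stores the result in the variable encodedString. It then decodes that string and prints the original input.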


// C++ program to encode and decode a string using
// Huffman Coding.
#include <iostream>
#include <map>
#include <queue>
#include <string>
#include <vector>
#define MAX_TREE_HT 256
using namespace std;

// maps each character to its Huffman code
map<char, string> codes;

// stores the frequency of each character of the input data
map<char, int> freq;

// A Huffman tree node
struct MinHeapNode
{
    char data;                 // One of the input characters
    int freq;                  // Frequency of the character
    MinHeapNode *left, *right; // Left and right child

    MinHeapNode(char data, int freq)
    {
        left = right = NULL;
        this->data = data;
        this->freq = freq;
    }
};

// comparator for the priority queue: keeps the node with
// the lowest frequency on top (min-heap)
struct compare
{
    bool operator()(MinHeapNode* l, MinHeapNode* r)
    {
        return (l->freq > r->freq);
    }
};

// utility function to print characters along with
// their Huffman codes
void printCodes(struct MinHeapNode* root, string str)
{
    if (!root)
        return;
    if (root->data != '$')
        cout << root->data << ": " << str << "\n";
    printCodes(root->left, str + "0");
    printCodes(root->right, str + "1");
}

// utility function to store characters along with
// their Huffman codes in a lookup table, here a
// C++ STL map
void storeCodes(struct MinHeapNode* root, string str)
{
    if (root == NULL)
        return;
    if (root->data != '$')
        codes[root->data] = str;
    storeCodes(root->left, str + "0");
    storeCodes(root->right, str + "1");
}

// STL priority queue holding the tree nodes, ordered by
// the frequency stored in each node
priority_queue<MinHeapNode*, vector<MinHeapNode*>, compare> minHeap;

// function to build the Huffman tree and store it
// in minHeap
void HuffmanCodes(int size)
{
    struct MinHeapNode *left, *right, *top;
    for (map<char, int>::iterator v = freq.begin(); v != freq.end(); v++)
        minHeap.push(new MinHeapNode(v->first, v->second));
    while (minHeap.size() != 1)
    {
        // remove the two lowest-frequency nodes
        left = minHeap.top();
        minHeap.pop();
        right = minHeap.top();
        minHeap.pop();
        // merge them under a new internal node marked '$'
        top = new MinHeapNode('$', left->freq + right->freq);
        top->left = left;
        top->right = right;
        minHeap.push(top);
    }
    storeCodes(minHeap.top(), "");
}

// utility function to map each character to its
// frequency in the input string
void calcFreq(string str, int n)
{
    for (size_t i = 0; i < str.size(); i++)
        freq[str[i]]++;
}

// function iterates through the encoded string s
// if s[i]=='1' then move to node->right
// if s[i]=='0' then move to node->left
// if leaf node append the node->data to our output string
string decode_file(struct MinHeapNode* root, string s)
{
    string ans = "";
    struct MinHeapNode* curr = root;
    for (size_t i = 0; i < s.size(); i++)
    {
        if (s[i] == '0')
            curr = curr->left;
        else
            curr = curr->right;

        // reached leaf node
        if (curr->left == NULL && curr->right == NULL)
        {
            ans += curr->data;
            curr = root;
        }
    }
    return ans;
}

// Driver program to test the functions above
int main()
{
    string str = "geeksforgeeks";
    string encodedString, decodedString;
    calcFreq(str, str.length());
    HuffmanCodes(str.length());
    cout << "Characters with their Huffman codes:\n";
    for (auto v = codes.begin(); v != codes.end(); v++)
        cout << v->first << ' ' << v->second << endl;

    for (auto i: str)
        encodedString += codes[i];

    cout << "\nEncoded Huffman data:\n" << encodedString << endl;

    decodedString = decode_file(minHeap.top(), encodedString);
    cout << "\nDecoded Huffman Data:\n" << decodedString << endl;
    return 0;
}


Characters with their Huffman codes:
e 10
f 1100
g 011
k 00
o 010
r 1101
s 111

Encoded Huffman data:
01110100011111000101101011101000111

Decoded Huffman Data:
geeksforgeeks

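Comparing the size of the input file with the size of the Huffman-encoded output: the size of the output data can be calculated in a simple way. Suppose our input is the string "geeksforgeeks", stored in a file input.txt.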

Input: "geeksforgeeks"
Total number of characters, i.e. input length: 13
Size: 13 character occurrences * 8 bits = 104 bits or 13 bytes.


Input: "geeksforgeeks"
Character |  Frequency |  Binary Huffman Value |
   e      |      4     |         10            |
   f      |      1     |         1100          |   
   g      |      2     |         011           |
   k      |      2     |         00            |
   o      |      1     |         010           |
   r      |      1     |         1101          |
   s      |      2     |         111           |

So to calculate output size:
e: 4 occurrences * 2 bits = 8 bits
f: 1 occurrence  * 4 bits = 4 bits
g: 2 occurrences * 3 bits = 6 bits
k: 2 occurrences * 2 bits = 4 bits
o: 1 occurrence  * 3 bits = 3 bits
r: 1 occurrence  * 4 bits = 4 bits
s: 2 occurrences * 3 bits = 6 bits

Total sum: 35 bits, approximately 5 bytes (4.375 bytes, rounded up to whole bytes)

