首页 > 代码库 > Huffman 压缩解压缩java实现

Huffman 压缩解压缩java实现

Huffman编码的原理这里就不说了,网上随处都是。这里来讲讲利用Huffman编码来进行压缩和解压缩的具体实现吧。本工程使用java实现。


编码

1. 流程图


2. 数据结构

CharacterWeight:记录字符值,以及其在待压缩文件中的权重。

Class{
char c; //字符值
int weight;  //在文件中权重
String code;  //其对应huffman编码
}

HuffmanNode:huffman树中的节点信息。

Class{
Int parent;  //父节点
Int lChild; //左儿子
Int rChild;  //右儿子
Int weight; //权重
}

3. 程序关键点

3.1 Huffman树的构建

Huffman树的变量:ArrayList<HuffmanNode> list;

创建流程图:

for(int i=0;i<list.size()-1;i++){
			//w1 : the first min weight w2: the second min weight 
			//i1 : the first min weight index, i2: the second min weight index
			int w1 = MAX_VALUE, w2=MAX_VALUE; 
			int i1 = 0, i2 = 0;
			// find the two node with the minimum weight
			for(int j=0;j<tree.size();j++){
				HuffmanNode node = tree.get(j);
				if(node.getWeight()< w1 && node.getParent()==-1){
					w2 = w1;
					w1 = node.getWeight();
					i2 = i1;
					i1 = j;
				}
				else if(node.getWeight()<w2 && node.getParent()==-1){
					w2 = node.getWeight();
					i2 = j;
				}
			}
			//set the two node to be the children of a new node, and add the new node to the tree
			HuffmanNode pNode = new HuffmanNode(w1+w2);
			pNode.setlChild(i1);
			pNode.setrChild(i2);
			tree.add(pNode);
			tree.get(i1).setParent(tree.indexOf(pNode));
			tree.get(i2).setParent(tree.indexOf(pNode));}

3.2   根据Huffman 树获得Huffman编码

从叶子节点开始网上遍历Huffman树,直到到达根节点,根据当前节点为其父节点的左儿子还是右儿子确定这一位值是0还是1。最后将依次获得的0,1字符串反转获得Huffman编码。

代码:

for(int i=0;i<list.size();i++){
			HuffmanNode node = tree.get(i);
			HuffmanNode pNode = tree.get(node.getParent());
			String code ="";
			while(true){
				if(pNode.getlChild()==tree.indexOf(node)){
					code = "0"+code;
				}
				else if(pNode.getrChild() == tree.indexOf(node)){
					code = "1"+code;
				}
				else {
					System.out.println("Tree Node Error!!!");
					return null;
				}
				node=pNode;
				if(node.getParent()!=-1)
					pNode=tree.get(node.getParent());
				else 
					break;
			}
			list.get(i).setCode(new String(code));
		}
3.3  文件头设计

字符总数

Int 四个字节

字符种类数

Short 两个字节

叶子节点

char字符  short 父节点 3个字节

非叶子节点

Short 左儿子 short 右儿子 short父节点 6字节

 

文件头长度(单位: byte)

l= 9n

其中n 为字符种类数。

3.4文件内容的编码和写入

	while((temp=reader.read())!=-1){ //!= EOF	
				// get the code from the code table
				String code = codeTable.get((char)temp);
				c++;
				if(c>=count/96){
					System.out.print("=");
					c=0;
				}
				try{
					StringBuilder codeString = new StringBuilder(code);
					outputStringBuffer.append(codeString);
					while(outputStringBuffer.length()>8){
						out.write(Short.parseShort(outputStringBuffer.substring(0, 8),2));
						outputStringBuffer.delete(0, 8);
					}
				} catch(Exception e){
					e.printStackTrace();
				}

			}

解码

1. 流程图


2.   数据结构

HuffmanNode:huffman树中的节点信息。

Class{

Int parent;  //父节点

Int lChild; //左儿子

Int rChild;  //右儿子

Int weight; //权重

Char c; //对应的字符值


3.    程序关键点

3.1   重建Huffman树。在文件头中存放的原本就是Huffman树的节点信息,所以重建Huffman树是比较简单的。

代码:

	in = new DataInputStream(new FileInputStream(file));
			count = in.readInt();
			charNum = in.readShort();
			nodeNum = 2*charNum -1;
			//rebuild the huffman tree
			for(int i=0;i<charNum;i++){
				HuffmanNode node = new HuffmanNode((char)in.readByte());
				int parent = in.readShort();
				node.setParent(parent);
				tree.add(node);
			}
			
			for(int i=charNum;i<nodeNum;i++){
				HuffmanNode node = new HuffmanNode(' ');
				int l = in.readShort();
				int r = in.readShort();
				int p = in.readShort();
				node.setlChild(l);
				node.setrChild(r);
				node.setParent(p);
				tree.add(node);
			}

3.2 解码

解码流程图

	while(true){
				while(buff.length()<32){
					temp = in.readInt();
					String codeString = Integer.toBinaryString(temp);
					while(codeString.length()<32){
						codeString='0'+codeString;
					}
					buff.append(codeString);
				}
				node = tree.get(tree.size()-1);
				dep = 0;
				while(!(node.getlChild()==-1&&node.getrChild()==-1)){
					if(dep>=buff.length()){
						System.out.println( "Buff overflow");
					}
					if(buff.charAt(dep)=='0'){
						node = tree.get(node.getlChild());
					}
					else if(buff.charAt(dep)=='1'){
						node = tree.get(node.getrChild());
					}
					else{
						System.out.println("Coding error");
					}
					dep++;
				}
				
				char c = node.getCH();
				num++;
				if(num>=n/99){
					System.out.print("=");
					num=0;
				}
				count++;
				if(count>=n){
					break;
				}
				charBuff+=c;
				if(charBuff.length()>256){
					writer.write(charBuff);
					charBuff="";
				}
				buff.delete(0, dep);
				
			}
			
		} catch(EOFException e){
			//just do nothing
		}
		catch(Exception e){
			e.printStackTrace();
		} finally{
			//there may be data released in the buff and charbuff, so we need to process them
			while(buff.length()>0){
				node = tree.get(tree.size()-1);
				dep = 0;
				while(!(node.getlChild()==-1&&node.getrChild()==-1)){
					if(dep>=buff.length()){
						break;
					}
					if(buff.charAt(dep)=='0'){
						node = tree.get(node.getlChild());
					}
					else if(buff.charAt(dep)=='1'){
						node = tree.get(node.getrChild());
					}
					else{
						System.out.println("Coding error");
						//return;
					}
					dep++;
				}
				char c = node.getCH();
				num++;
				if(num>=n/99){
					System.out.print("=");
					num=0;
				}
				count++;
				if(count>=n){
					break;
				}
				charBuff+=c;
				if(charBuff.length()>256){
					try {
						writer.write(charBuff);
					} catch (IOException e1) {
						// TODO Auto-generated catch block
						e1.printStackTrace();
					}
					charBuff="";
				}
				buff.delete(0, dep);
			}
			
			try {
				writer.write(charBuff);
				writer.close();
			} catch (IOException e) {
				// TODO Auto-generated catch block
				e.printStackTrace();
			} 
		}
		try{
			writer.close();
		} catch(IOException e){
			throw e;
		}

    完整工程就不放出了,以后再更新吧。