
04-树6. Huffman Codes (30)

Date: 2015-07-25 09:27:06

Tags: ZJU PAT   MOOC   Huffman tree   priority queue   C implementation


Time limit: 200 ms
Memory limit: 65536 kB
Code length limit: 8000 B
Judge: Standard
Author: CHEN, Yue

In 1953, David A. Huffman published his paper "A Method for the Construction of Minimum-Redundancy Codes", and hence printed his name in the history of computer science. As a professor who gives the final exam problem on Huffman codes, I am encountering a big problem: the Huffman codes are NOT unique. For example, given a string "aaaxuaxz", we can observe that the frequencies of the characters 'a', 'x', 'u' and 'z' are 4, 2, 1 and 1, respectively. We may either encode the symbols as {'a'=0, 'x'=10, 'u'=110, 'z'=111}, or in another way as {'a'=1, 'x'=01, 'u'=001, 'z'=000}, both compress the string into 14 bits. Another set of code can be given as {'a'=0, 'x'=11, 'u'=100, 'z'=101}, but {'a'=0, 'x'=01, 'u'=011, 'z'=001} is NOT correct since "aaaxuaxz" and "aazuaxax" can both be decoded from the code 00001011001001. The students are submitting all kinds of codes, and I need a computer program to help me determine which ones are correct and which ones are not.
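To make the ambiguity in the statement concrete, here is a minimal sketch that encodes a string with a given code table. The function name `encode` and its parameters are hypothetical helpers introduced for illustration; they are not part of the solution further down.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes the concatenated codes of every character of
 * `text` into `out` as a string of '0'/'1' characters. symbols[i] is encoded
 * as codes[i]. Returns the number of bits written, or -1 if a character is
 * not in `symbols` or the buffer of `cap` bytes is too small. */
int encode(const char *text, const char *symbols, const char *const *codes,
           char *out, size_t cap)
{
    size_t used = 0;
    if (cap == 0)
        return -1;
    out[0] = '\0';
    for (; *text != '\0'; ++text) {
        const char *pos = strchr(symbols, *text);
        if (pos == NULL)
            return -1;
        const char *code = codes[pos - symbols];
        size_t len = strlen(code);
        if (used + len + 1 > cap)
            return -1;
        memcpy(out + used, code, len + 1);   /* copies the terminator too */
        used += len;
    }
    return (int)used;
}
```

Called with the symbols "axuz" and the rejected table {"0", "01", "011", "001"}, both "aaaxuaxz" and "aazuaxax" should come out as the same 14-bit string, which is why a matching length alone is not enough to accept a submission.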

Input Specification:

Each input file contains one test case. For each case, the first line gives an integer N (2 <= N <= 63), then followed by a line that contains all the N distinct characters and their frequencies in the following format:

c[1] f[1] c[2] f[2] ... c[N] f[N]

where c[i] is a character chosen from {'0' - '9', 'a' - 'z', 'A' - 'Z', '_'}, and f[i] is the frequency of c[i] and is an integer no more than 1000. The next line gives a positive integer M (<=1000), then followed by M student submissions. Each student submission consists of N lines, each in the format:

c[i] code[i]

where c[i] is the i-th character and code[i] is a string of '0's and '1's.

Output Specification:

For each test case, print in each line either “Yes” if the student’s submission is correct, or “No” if not.

Sample Input:
7
A 1 B 1 C 1 D 3 E 3 F 6 G 6
4
A 00000
B 00001
C 0001
D 001
E 01
F 10
G 11
A 01010
B 01011
C 0100
D 011
E 10
F 11
G 00
A 000
B 001
C 010
D 011
E 100
F 101
G 110
A 00000
B 00001
C 0001
D 001
E 00
F 10
G 11
Sample Output:
Yes
Yes
No
No
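
The sample can be checked by hand against the two conditions the solution below tests: the total encoded length must equal the minimum, and no code may be a prefix of another. The following is a minimal sketch of the length half only. It swaps the heap for a plain array scan, which is adequate for N <= 63; `min_total_bits` and `submission_bits` are hypothetical names, not functions from the solution.

```c
#include <string.h>

/* Hypothetical helper: minimum total encoded length for the given
 * frequencies, found by repeatedly merging the two smallest weights.
 * Assumes 1 <= n <= 64. */
int min_total_bits(const int *freq, int n)
{
    int w[64];
    int total = 0;
    memcpy(w, freq, (size_t)n * sizeof w[0]);
    while (n > 1) {
        /* Move the two smallest weights into the last two slots. */
        for (int k = 0; k < 2; ++k) {
            int last = n - 1 - k, min = last;
            for (int i = 0; i < last; ++i)
                if (w[i] < w[min])
                    min = i;
            int t = w[min]; w[min] = w[last]; w[last] = t;
        }
        w[n - 2] += w[n - 1];   /* weight of the merged subtree */
        total += w[n - 2];      /* a merge adds one bit to every leaf below it */
        --n;
    }
    return total;
}

/* Hypothetical helper: total encoded length of one submission,
 * where codes[i] is the code submitted for the symbol with freq[i]. */
int submission_bits(const int *freq, const char *const *codes, int n)
{
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += freq[i] * (int)strlen(codes[i]);
    return total;
}
```

For the sample frequencies 1 1 1 3 3 6 6 the merges cost 2, 3, 6, 9, 12 and 21, so the minimum is 53 bits. The third submission uses 3 bits per symbol, 63 bits in total, and is rejected on length. The fourth also totals 53 bits but fails because E's code 00 is a prefix of A's code 00000.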

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void percolateDown(int *heap, int parent) {	// percolate the element at index parent down until heap order holds (heap[0] stores the element count)
	int temp = heap[parent];
	int child = 2 * parent;
	if (child + 1 <= heap[0] && heap[child + 1] < heap[child])
		++child;
	while (child <= heap[0] && heap[child] < temp) {
		heap[parent] = heap[child];
		parent = child;
		child = 2 * parent;
		if (child + 1 <= heap[0] && heap[child + 1] < heap[child])
			++child;
	}
	heap[parent] = temp;
}
void buildMinHeap(int *heap) {			// the structure property already holds; rearrange elements to establish heap order
	for (int i = heap[0] / 2; i > 0; --i)	// percolate down every non-leaf node
		percolateDown(heap, i);
}
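int deleteMin(int *heap) {				// remove and return the top element, then restore heap order
	int minElem = heap[1];
	heap[1] = heap[heap[0]--];			// move the last element to the top
	percolateDown(heap, 1);				// the top is the only element that may violate heap order, so percolate it down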
	return minElem;
}
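void insertMinHeap(int *heap, int weight) {			// insert one element into the heap
	heap[++heap[0]] = weight;			// append it at the end of the heap
	// percolate down, in turn, each ancestor of the new node that is larger than it; this is equivalent to percolating the new node up, at slightly higher cost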
	for (int i = heap[0] / 2; i > 0 && heap[i] > weight; i /= 2)
		percolateDown(heap, i);
}
int calWPL(int *freq) {
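	int heap[100] = {0};				// heap used to simulate the Huffman tree: index 0 stores the element count, weights start at index 1
	int size = 0;
	for (int i = 0; i < 256; ++i) {		// copy every weight into the empty array, to be heapified below
		if (freq[i]) {
			heap[++size] = freq[i];
		}
	}
	heap[0] = size;						// index 0 stores the number of elements in the heap
	buildMinHeap(heap);					// build the heap
	// simulate the Huffman construction: repeatedly pop the two subtrees of smallest weight, merge them, and push the merged tree (weight = sum of the two) back;
	// the WPL of the merged tree = the WPLs of the two subtrees (paths inside them) + the sum of their weights (every leaf gains one edge to the new parent);
	// so the tree itself need not be built: the heap keeps the subtree weights, and the variable wpl accumulates the WPL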
	int wpl = 0;
	for (int i = 1; i < size; ++i) {
		int weight1 = deleteMin(heap);
		int weight2 = deleteMin(heap);
		wpl += weight1 + weight2;
		insertMinHeap(heap, weight1 + weight2);
	}
	return wpl;
}
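int isPrefix(char *s1, char *s2) {		// returns 1 if one string is a prefix of the other (or they are equal)
	while (*s1 && *s2 && *s1 == *s2)	// advance to the first differing character or the end of either string
		++s1, ++s2;
	if (*s1 == '\0' || *s2 == '\0')		// if either string has ended here, it is a prefix of the other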
		return 1;
	else
		return 0;
}
int hasPrefixCode(char s[][200], int n) {// returns 1 if any of the n codes is a prefix of another
	for (int i = 0; i < n; ++i)
		for (int j = i + 1; j < n; ++j)
			if (isPrefix(s[i], s[j]))	// compare every pair
				return 1;
	return 0;
}
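int main(void) {
	int n;
	scanf("%d", &n);
	int freq[256] = {0};
	for (int i = 0; i < n; ++i) {
		char ch;
		int num;
		scanf(" %c %d", &ch, &num);		// the leading space skips the whitespace before each character
		freq[(unsigned char)ch] = num;
	}
	int wpl = calWPL(freq);				// simulate the Huffman construction to get the minimum WPL (weighted path length)
	int k;		// number of student submissions
	scanf("%d", &k);
	while (k--) {
		char ch[256];
		char s[256][200];
		int thisWPL = 0;
		for (int i = 0; i < n; ++i) {
			scanf(" %c %199s", &ch[i], s[i]);	// the width keeps a code within the 200-byte buffer
			thisWPL += freq[(unsigned char)ch[i]] * (int)strlen(s[i]);	// accumulate the weighted path length of this submission
		}
		if (thisWPL == wpl && !hasPrefixCode(s, n))		// accepted only if the WPL is minimal and no code is a prefix of another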
			printf("Yes\n");
		else
			printf("No\n");
	}

	return 0;
}
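
The pairwise comparison in hasPrefixCode is fast enough for N <= 63. As an alternative, not the author's method, the same check can be done in one pass by inserting the codes into a binary trie. This is a minimal sketch; `is_prefix_free` and `MAX_NODES` are hypothetical names, and the node limit is an assumption about the total length of the codes.

```c
#include <string.h>

#define MAX_NODES 4096	/* assumption: the codes need fewer trie nodes than this */

/* Returns 1 if no code is a prefix of (or equal to) another code, else 0.
 * Also returns 0 for an empty code, a character other than '0'/'1',
 * or when the trie would exceed MAX_NODES. */
int is_prefix_free(const char *const *codes, int n)
{
    static int child[MAX_NODES][2];	/* child[v][b] == 0 means "no child" */
    static char is_end[MAX_NODES];	/* 1 if some code ends at node v */
    int used = 1;			/* node 0 is the root */
    memset(child, 0, sizeof child);
    memset(is_end, 0, sizeof is_end);
    for (int i = 0; i < n; ++i) {
        int node = 0, created = 0;
        for (const char *p = codes[i]; *p != '\0'; ++p) {
            if (*p != '0' && *p != '1')
                return 0;
            int bit = *p - '0';
            if (child[node][bit] == 0) {
                if (used == MAX_NODES)
                    return 0;
                child[node][bit] = used++;
                created = 1;
            }
            node = child[node][bit];
            if (is_end[node])
                return 0;	/* an earlier code is a prefix of this one */
        }
        if (!created)
            return 0;		/* this code is a prefix of an earlier one */
        is_end[node] = 1;
    }
    return 1;
}
```

With the sample input, the first submission should pass this check and the fourth should fail it, because walking E's code 00 creates no new node after A's 00000 has been inserted.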


Problem link: http://www.patest.cn/contests/mooc-ds/04-%E6%A0%916

Copyright notice: this is an original article by the blog author and may not be reproduced without the author's permission.


Original article: http://blog.csdn.net/ice_camel/article/details/47053193
