Shell script for counting word frequency

Posted: 2015-09-16 12:22:06


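#!/bin/bash
# Count how often each word occurs in the file given as the only argument.
if [ $# -ne 1 ]; then
        echo "Usage: $0 filename" >&2
        exit 1
fi

filename=$1
# grep -E is the current spelling of egrep; -o prints each matched word on its own line.
grep -E -o "\b[[:alpha:]]+\b" "$filename" | awk '{ count[$0]++ }
END {
        printf("%-14s%s\n", "Word", "Count")
        for (ind in count) printf("%-14s%d\n", ind, count[ind])
}'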
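The awk `for (ind in count)` loop visits the words in an unspecified order, so the table printed by the script above is not sorted. As a minimal sketch (the function name `word_freq_sorted` is invented for this illustration and is not part of the original script), the same tally can be built with `sort` and `uniq -c` and ordered by frequency:

```shell
#!/bin/bash
# Hypothetical helper: print one "word count" row per distinct word in file $1,
# most frequent first, ties broken alphabetically.
word_freq_sorted() {
    grep -E -o '\b[[:alpha:]]+\b' "$1" |
        sort |                       # group identical words together
        uniq -c |                    # prefix each distinct word with its count
        sort -k1,1nr -k2,2 |         # highest count first, then by word
        awk '{ printf("%-14s%d\n", $2, $1) }'
}
```

It would be called as `word_freq_sorted somefile.txt`; the row format matches the `%-14s%d` layout used by the original script.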

A few points to note:

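egrep vs. grep: egrep uses extended regular expressions (ERE), so operators such as +, ?, | and () work without a backslash. It is equivalent to grep -E.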
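To make the difference concrete, here is a small sketch (the function names `count_bre` and `count_ere` are illustrative only). In a basic regular expression a bare `+` is an ordinary character, while with `-E` it means "one or more":

```shell
#!/bin/bash
# Hypothetical helpers: count the lines on stdin that match the pattern a+b.
# "|| true" keeps the exit status at 0 when grep finds nothing.
count_bre() { grep -c 'a+b' || true; }      # basic regex: + is a literal plus sign
count_ere() { grep -E -c 'a+b' || true; }   # extended regex: + means "one or more"

printf 'aaab\n' | count_bre   # prints 0: the line does not contain the text "a+b"
printf 'aaab\n' | count_ere   # prints 1: "aaab" is one or more a followed by b
```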

The symbol \b matches the empty string at the edge of a word, that is, at a word boundary.
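A short sketch of what the word boundary buys you, assuming GNU grep (the helper name `whole_cat` is made up for this example). Without `\b` the pattern would also match the "cat" inside "concat":

```shell
#!/bin/bash
# Hypothetical helper: print every standalone occurrence of the word "cat" on stdin.
whole_cat() { grep -o '\bcat\b' || true; }

# Prints "cat" twice; the "cat" inside "concat" is not at a word boundary.
printf 'cat concat cat\n' | whole_cat
```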

\< \>

The symbols \< and \> respectively match the empty string at the beginning and end of a word.
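Unlike \b, these two anchors are one-sided. The following sketch, again assuming GNU grep and using invented helper names, shows each side on its own:

```shell
#!/bin/bash
# Hypothetical helpers, assuming GNU grep.
starts_with_cat() { grep -o '\<cat[[:alpha:]]*' || true; }   # words beginning with "cat"
ends_with_cat()   { grep -o '[[:alpha:]]*cat\>' || true; }   # words ending with "cat"

printf 'cat catalog concat\n' | starts_with_cat   # prints: cat, catalog
printf 'cat catalog concat\n' | ends_with_cat     # prints: cat, concat
```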

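%-14s: the - means left-aligned and 14 is the minimum field width, so a shorter string is padded with spaces on the right up to 14 characters.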
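The effect is easiest to see next to the right-aligned form. In this sketch the helper names are illustrative and the trailing `|` is only there to mark where the 14-character field ends:

```shell
#!/bin/bash
# Hypothetical helpers: print $1 in a 14-character field followed by a | marker.
left_align()  { printf '%-14s|\n' "$1"; }   # text first, padding after
right_align() { printf '%14s|\n' "$1"; }    # padding first, text after

left_align  Word
right_align Word
```

Both lines are 15 characters long; only the side that receives the ten spaces of padding differs.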

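[:alpha:] is a POSIX character class that matches any alphabetic character; in the C locale it is equivalent to a-zA-Z. It is only valid inside a bracket expression, which is why the script writes [[:alpha:]]. For details see: http://www.cnblogs.com/zhuyp1015/archive/2012/07/01/2572289.html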
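As a quick illustration (the helper name `letters_only` is hypothetical), the class can be used to split a string into runs of letters, with digits and punctuation acting as separators:

```shell
#!/bin/bash
# Hypothetical helper: print each run of alphabetic characters on stdin on its own line.
letters_only() { grep -E -o '[[:alpha:]]+' || true; }

# Prints abc, def and gh on separate lines; 1, 2, 3 and _ are not alphabetic.
printf 'abc123def_gh\n' | letters_only
```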


Original article: http://www.cnblogs.com/ggbond1988/p/4812527.html
