
Chapter 12: Regular Expressions and File Formatting

Posted: 2016-07-17 21:10:48


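Basic Regular Expressions

How the Locale Affects Regular Expressions

Different locales can sort characters in different orders:

LANG=C: 0 1 2 ... A B C ... a b c ...

LANG=zh_CN: 0 1 2 ... a A b B ...

As a result, a range such as [A-Z] can match different characters depending on the locale. The character classes below avoid this ambiguity.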
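The locale effect described above can be sidestepped by forcing the C locale for a single command. The following is a minimal sketch, assuming a POSIX shell and grep; the two-line sample input and the variable names are invented for illustration.

```shell
# Sample input: one lowercase and one uppercase letter, one per line.
sample=$(printf 'a\nB\n')

# Force the C locale for this one command, so that [A-Z] covers
# only the uppercase ASCII letters.
c_matches=$(printf '%s\n' "$sample" | LC_ALL=C grep '[A-Z]')
printf 'C locale matches: %s\n' "$c_matches"
```

Under other locales the same range may or may not also match lowercase letters, depending on the system's collation rules.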

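Symbol Meaning
[:alnum:] Letters and digits: 0-9, A-Z, a-z
[:alpha:] Letters, upper and lower case: A-Z, a-z
[:blank:] The space and Tab characters
[:cntrl:] Control characters such as CR, LF, Tab and Del
[:digit:] Digits: 0-9
[:graph:] All printable characters except whitespace (space and Tab)
[:lower:] Lowercase letters: a-z
[:print:] Any printable character
[:punct:] Punctuation characters, e.g. " ' ? ; : # $
[:upper:] Uppercase letters: A-Z
[:space:] Any character that produces whitespace
[:xdigit:] Hexadecimal digits: 0-9, A-F, a-f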
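A class name only works inside a bracket expression, so the pattern is written [[:digit:]] rather than [:digit:]. A small sketch with invented sample input and variable names:

```shell
# Three sample lines: lowercase letters, uppercase letters, digits.
sample=$(printf 'abc\nABC\n123\n')

# Lines that contain at least one digit.
digits=$(printf '%s\n' "$sample" | grep '[[:digit:]]')

# Lines that contain no lowercase letter at all (-v inverts the match).
no_lower=$(printf '%s\n' "$sample" | grep -v '[[:lower:]]')

printf 'digits:\n%s\n' "$digits"
printf 'no lowercase:\n%s\n' "$no_lower"
```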

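Some Advanced grep Options

Beyond the basic usage covered in the previous chapter, grep has a few more advanced options.

grep [-A n] [-B n] [--color=auto] 'search string' filename

Options:

-A n: "after"; besides the matching line, also print the n lines that follow it

-B n: "before"; besides the matching line, also print the n lines that precede it

--color=auto: highlight the matched text in color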

// -n prints the line number of each match
[root@localhost 桌面]# dmesg | grep -n --color=auto eth
1730:[   10.210383] e1000 0000:02:01.0 eth0: (PCI:66MHz:32-bit) 00:0c:29:7f:dd:91
1731:[   10.210404] e1000 0000:02:01.0 eth0: Intel(R) PRO/1000 Network Connection

Note: grep works line by line and always prints the whole line that contains a match.
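The -A and -B options are easiest to see on a tiny input. This is a sketch with an invented five-line sample, assuming a grep that supports these context options (GNU and BSD grep do):

```shell
# Print the matching line plus one line of context on each side.
context=$(printf 'one\ntwo\nthree\nfour\nfive\n' | grep -A 1 -B 1 'three')
printf '%s\n' "$context"
```

Only "three" matches the pattern; "two" and "four" are printed as context lines.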

 

Basic Regular Expression Exercises

The exercises use the following text file:

[root@localhost 桌面]# cat regular_express.txt
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesnt fit me.
However, this dress is about $ 3183 dollars.
GNU is free air not free beer.
Her hair is very beauty.
I cant finish the test.
Oh! The soup taste good.
motorcycle is cheap than car.
This window is clear.
the symbol * is represented as start.
Oh!    My god!
The gd software is a library for drafting programs.
You are the best is mean you are the no. 1.
The world <Happy> is the same with "glad".
I like dog.
google is the best tools for search keyword.
goooooogle yes!
go! go! Lets go.
# I am VBird

[root@localhost 桌面]# 

Example 1: Searching for a specific string

// find lines that contain "the"
[root@localhost 桌面]# grep -n 'the' regular_express.txt
8:I cant finish the test.
12:the symbol * is represented as start.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
18:google is the best tools for search keyword.

// find lines that do not contain "the"
[root@localhost 桌面]# grep -vn 'the' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesnt fit me.
5:However, this dress is about $ 3183 dollars.
6:GNU is free air not free beer.
7:Her hair is very beauty.
9:Oh! The soup taste good.
10:motorcycle is cheap than car.
11:This window is clear.
13:Oh!    My god!
14:The gd software is a library for drafting programs.
17:I like dog.
19:goooooogle yes!
20:go! go! Lets go.
21:# I am VBird
22:
[root@localhost 桌面]# 
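Because -v selects exactly the lines that the plain search skips, the two counts should add up to the total number of lines. A quick sanity check, using an invented four-line sample and the standard -c (count) option:

```shell
# Note that "other" contains the substring "the", while "The dog" does not
# (the search is case-sensitive).
sample=$(printf 'the cat\nThe dog\nother\nbird\n')

with=$(printf '%s\n' "$sample" | grep -c 'the')       # lines containing "the"
without=$(printf '%s\n' "$sample" | grep -vc 'the')   # lines not containing it
total=$(( $(printf '%s\n' "$sample" | wc -l) ))       # all lines

printf 'with=%s without=%s total=%s\n' "$with" "$without" "$total"
```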

Example 2: Matching a set of characters with brackets []

// find "tast" or "test"
[root@localhost 桌面]# grep -n 't[ae]st' regular_express.txt
8:I cant finish the test.
9:Oh! The soup taste good.

// find "oo" that is not preceded by g
[root@localhost 桌面]# grep -n '[^g]oo' regular_express.txt
2:apple is my favorite food.
3:Football game is not use feet only.
18:google is the best tools for search keyword.
19:goooooogle yes!

// find lines that contain a digit
[root@localhost 桌面]# grep -n '[0-9]' regular_express.txt
5:However, this dress is about $ 3183 dollars.
15:You are the best is mean you are the no. 1.

// find "oo" that is not preceded by a lowercase letter
[root@localhost 桌面]# grep -n '[^[:lower:]]oo' regular_express.txt
3:Football game is not use feet only.
[root@localhost 桌面]# 
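A bracket expression always stands for exactly one character, and a negated one such as [^g] still requires that a character be present. The sketch below uses an invented sample in which "oops" starts with "oo" and therefore has nothing in front of it to match [^g]:

```shell
sample=$(printf 'food\ngood\nFootball\noops\ntaste\ntest\n')

# "a" or "e" between "t" and "st".
t_st=$(printf '%s\n' "$sample" | grep 't[ae]st')

# "oo" preceded by some character other than g.
not_g=$(printf '%s\n' "$sample" | grep '[^g]oo')

# "oo" preceded by some character that is not a lowercase letter.
not_lower=$(printf '%s\n' "$sample" | grep '[^[:lower:]]oo')

printf '%s\n---\n%s\n---\n%s\n' "$t_st" "$not_g" "$not_lower"
```

The patterns are single-quoted so that the shell does not try to expand the brackets as a filename pattern before grep sees them.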

Example 3: Line-start and line-end anchors ^ and $

// find lines that start with "the"
[root@localhost 桌面]# grep -n '^the' regular_express.txt
12:the symbol * is represented as start.

// find lines that start with a lowercase letter
[root@localhost 桌面]# grep -n '^[a-z]' regular_express.txt
2:apple is my favorite food.
4:this dress doesnt fit me.
10:motorcycle is cheap than car.
12:the symbol * is represented as start.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Lets go.

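// find lines that end with a period (the dot must be escaped, and the pattern quoted so the shell keeps the backslash)
// some lines, 5 to 9 for example, look like they end with a period but are not matched; a likely cause is a DOS-style carriage return before the line break
[root@localhost 桌面]# grep -n '\.$' regular_express.txt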
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesnt fit me.
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol * is represented as start.
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
20:go! go! Lets go.

// find empty lines
[root@localhost 桌面]# grep -n '^$' regular_express.txt
22:
[root@localhost 桌面]# 
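One way a line can appear to end with a period and still fail '\.$' is a DOS-style line ending, where a carriage return sits between the period and the line break. The following sketch reproduces that with an invented three-line sample; stripping the carriage returns with tr is one possible fix, not something the article itself does.

```shell
# The second line ends with CR LF instead of LF.
sample=$(printf 'unix line.\ndos line.\r\nno period\n')

# Only the first line has the period directly before the line end.
before=$(printf '%s\n' "$sample" | grep -c '\.$')

# After deleting the carriage returns, both period lines match.
after=$(printf '%s\n' "$sample" | tr -d '\r' | grep -c '\.$')

printf 'before=%s after=%s\n' "$before" "$after"
```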

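Example 4: Any character . and repetition *

. (period): matches exactly one character, whatever it is

*: repeats the preceding character zero or more times

// find strings that start with g, end with d, and have exactly two characters in between
[root@localhost 桌面]# grep -n 'g..d' regular_express.txt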
1:"Open Source" is a good mechanism to develop programs.
9:Oh! The soup taste good.
16:The world <Happy> is the same with "glad".

// find two or more consecutive o's (oo followed by zero or more o)
[root@localhost 桌面]# grep -n 'ooo*' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.
18:google is the best tools for search keyword.
19:goooooogle yes!
[root@localhost 桌面]# 
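Since * also allows zero repetitions, a pattern like 'o*' matches every line, including lines with no o at all. That is why the example above writes 'ooo*' to require at least two. A sketch on an invented three-line sample:

```shell
sample=$(printf 'gd\ngod\ngood\n')

# 'o*' allows zero o's, so every line matches.
star_only=$(printf '%s\n' "$sample" | grep -c 'o*')

# 'oo*' requires one o, followed by zero or more further o's.
one_or_more=$(printf '%s\n' "$sample" | grep -c 'oo*')

# 'g.*d' is g, then any number of arbitrary characters, then d.
g_any_d=$(printf '%s\n' "$sample" | grep -c 'g.*d')

printf 'o*=%s oo*=%s g.*d=%s\n' "$star_only" "$one_or_more" "$g_any_d"
```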

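Example 5: Limiting the repeat count with {}

In a basic regular expression the braces must be escaped as \{ and \}, and the pattern should be single-quoted so that the shell passes the backslashes through to grep.

// find o repeated twice
[root@localhost 桌面]# grep -n 'o\{2\}' regular_express.txt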
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.
18:google is the best tools for search keyword.
19:goooooogle yes!

// find o repeated 2 to 5 times
[root@localhost 桌面]# grep -n 'o\{2,5\}' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.
18:google is the best tools for search keyword.
19:goooooogle yes!

// find g, then two or more o's, then g
[root@localhost 桌面]# grep -n 'go\{2,\}g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!
[root@localhost 桌面]# 
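The interval only becomes selective when something anchors both ends of the run, as the g on either side does in the last command. The sketch below uses an invented four-line sample and also shows the same pattern in extended syntax (grep -E), where the braces are written without backslashes; the -E form is an addition for comparison and is not used in the article.

```shell
sample=$(printf 'gogle\ngoogle\ngooogle\ngoooogle\n')

# Basic regular expression: braces are escaped.
bre=$(printf '%s\n' "$sample" | grep 'go\{2,3\}g')

# Extended regular expression: the same interval without backslashes.
ere=$(printf '%s\n' "$sample" | grep -E 'go{2,3}g')

printf 'BRE:\n%s\nERE:\n%s\n' "$bre" "$ere"
```

Both commands select the lines with two or three o's between the g's. Without the quotes, the shell would strip the backslashes before grep runs, and the basic form would no longer be read as an interval.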

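Basic Regular Expression Characters

The five examples above cover the basic regular expression characters, summarized below:

RE character Meaning
^word The string being searched for (word) is at the start of the line
word$ The string being searched for (word) is at the end of the line
. Exactly one arbitrary character
\ Escape character; removes the special meaning of the character that follows
* Zero or more repetitions of the preceding character
[list] Any single character from the listed set
[n1-n2] Any single character in the range n1 to n2
[^list] Any single character that is not in the listed set or range
\{n,m\} n to m repetitions of the preceding character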
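The escape character from the table is what makes it possible to search for a literal * or a literal period. A short sketch with an invented sample:

```shell
sample=$(printf 'a*b\nab\naab\nend.\nends\n')

# \* is a literal asterisk, so only the line "a*b" matches.
literal_star=$(printf '%s\n' "$sample" | grep 'a\*b')

# \. is a literal period; an unescaped . would also accept the "s" in "ends".
literal_dot=$(printf '%s\n' "$sample" | grep 'end\.')
any_char=$(printf '%s\n' "$sample" | grep 'end.')

printf '%s\n---\n%s\n---\n%s\n' "$literal_star" "$literal_dot" "$any_char"
```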

 

 

 

 

 


Original article: http://www.cnblogs.com/wuchaodzxx/p/5678709.html
