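Regular Expressions 2: Regular Expressions in Practice, Part 1
Regular expressions:
Put simply, a regular expression is a set of rules and methods for processing strings. Text is processed one line at a time, and with the help of a few special symbols we can quickly filter or replace particular strings.
Operations work produces large volumes of access logs, error logs, and other data. Regular expressions are how we quickly filter out the content we need.
The three tools awk, sed, and grep (egrep), often nicknamed the "three swordsmen", only work efficiently when paired with regular expressions.
To use them well, we first have to master regular expressions.
On Linux, "regular expressions" mainly means the regular expressions used by awk, sed, and grep (egrep).
Basic regular expressions (BRE)
A regular expression is really just a set of special characters that have been given particular meanings.
1) ^word matches lines that begin with word.
2) word$ matches lines that end with word.
Example:
Contents of the file oldboy.log: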
I am oldboy linux teacher.
I like chinese chess,table tennis.
My blog is http://oldboy.blog.51cto.com
My qq is 49000448
my god,my name is not oldbey,but OLDBOY.
Filter the lines that begin with I:
[root@weibochoutu_1 test]# grep "^I" oldboy.log
I am oldboy linux teacher.
I like chinese chess,table tennis.
Filter the lines that begin with M:
[root@weibochoutu_1 test]# grep "^M" oldboy.log
My blog is http://oldboy.blog.51cto.com
My qq is 49000448
-i: ignore case
[root@weibochoutu_1 test]# grep -i "^M" oldboy.log
My blog is http://oldboy.blog.51cto.com
My qq is 49000448
my god,my name is not oldbey,but OLDBOY.
Filter the lines that end with m (ignoring case):
[root@weibochoutu_1 test]# grep -i "M$" oldboy.log
My blog is http://oldboy.blog.51cto.com
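The two anchors can be checked by counting matches instead of reading the output by eye. The following is a minimal sketch that rebuilds the five-line sample in a temporary file (the temporary path and the variable names are assumptions made for illustration) and uses grep -c to count matching lines.

```shell
#!/bin/sh
# Recreate the five-line sample file in a temporary location.
f=$(mktemp)
cat > "$f" <<'EOF'
I am oldboy linux teacher.
I like chinese chess,table tennis.
My blog is http://oldboy.blog.51cto.com
My qq is 49000448
my god,my name is not oldbey,but OLDBOY.
EOF

# -c prints the number of matching lines instead of the lines themselves.
starts_with_I=$(grep -c '^I' "$f")        # lines beginning with I
starts_with_m_any=$(grep -ci '^m' "$f")   # lines beginning with m or M
ends_with_448=$(grep -c '448$' "$f")      # lines ending with 448
echo "$starts_with_I $starts_with_m_any $ends_with_448"   # prints: 2 3 1
rm -f "$f"
```

Single quotes are used around the patterns so that the shell never tries to interpret the $ itself.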
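3) . matches exactly one character, and it can be any character.
Example 1:
Show the lines that contain blog (bl.g matches bl, then any one character, then g):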
[root@weibochoutu_1 test]# grep "bl.g" oldboy.log
My blog is http://oldboy.blog.51cto.com
Example 2:
[root@weibochoutu_1 test]# echo "is blog not boog" >> oldboy.log
[root@weibochoutu_1 test]# echo "not boog" >> oldboy.log
[root@weibochoutu_1 test]# cat oldboy.log
I am oldboy linux teacher.
I like chinese chess,table tennis.
My blog is http://oldboy.blog.51cto.com
My qq is 49000448
my god,my name is not oldbey,but OLDBOY.
is blog not boog
not boog
[root@weibochoutu_1 test]# grep "b.og" oldboy.log
My blog is http://oldboy.blog.51cto.com
is blog not boog
not boog
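4) \ is the escape character. It strips a character of its special meaning so that it matches literally.
Example: \. matches a literal dot.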
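To see the difference between . and \. side by side, here is a small sketch that feeds four made-up test strings (not taken from oldboy.log) through grep and counts the matches.

```shell
#!/bin/sh
# Four test strings, one per line. Only the first contains a literal dot,
# and the last has nothing between a and c.
input=$(printf '%s\n' 'a.c' 'abc' 'a-c' 'ac')

# An unescaped dot matches any one character, but there must be one.
unescaped=$(printf '%s\n' "$input" | grep -c 'a.c')
# An escaped dot matches only a real dot.
escaped=$(printf '%s\n' "$input" | grep -c 'a\.c')

echo "unescaped=$unescaped escaped=$escaped"   # prints: unescaped=3 escaped=1
```

The string ac is rejected by both patterns because . always consumes exactly one character.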
5) * matches zero or more repetitions of the preceding character.
Example: O* matches no O at all or any number of O characters.
Example:
[root@weibochoutu_1 test]# cat oldboy.log
I am oldboy linux teacher.
I like chinese chess,table tennis.
My blog is http://oldboy.blog.51cto.com
My qq is 49000448
my god,my name is not oldbey,but OLDBOY.
is blog not boog
not boog
[root@weibochoutu_1 test]# grep "490*448" oldboy.log
My qq is 49000448
After the line 4900000448,49448 is appended to the file:
[root@weibochoutu_1 test]# cat oldboy.log
I am oldboy linux teacher.
I like chinese chess,table tennis.
My blog is http://oldboy.blog.51cto.com
My qq is 49000448
my god,my name is not oldbey,but OLDBOY.
is blog not boog
not boog
4900000448,49448
[root@weibochoutu_1 test]# grep "490*448" oldboy.log
My qq is 49000448
4900000448,49448
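The line 4900000448,49448 matches because 0* accepts five zeros, and it would also match through 49448 alone, where 0* accepts none. A minimal sketch with one made-up candidate per line, anchored with ^ and $ so that the whole string has to fit the pattern:

```shell
#!/bin/sh
# Candidates differ only in what sits between 49 and 448.
matches=$(printf '%s\n' 4948 49448 490448 49000448 4910448 | grep '^490*448$')
echo "$matches"
# prints:
# 49448
# 490448
# 49000448
```

4948 and 4910448 are rejected: * only repeats the single character before it (here 0), so nothing other than zeros may appear between 49 and 448.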
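6) .* matches any sequence of characters, including none. ^.* matches any number of characters starting from the beginning of the line.
Example: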
[root@weibochoutu_1 test]# cat oldboy.log
I am oldboy linux teacher.
I like chinese chess,table tennis.
My blog is http://oldboy.blog.51cto.com
My qq is 49000448
my god,my name is not oldbey,but OLDBOY.
is blog not boog
not boog
4900000448,49448
[root@weibochoutu_1 test]# grep ".*" oldboy.log
I am oldboy linux teacher.
I like chinese chess,table tennis.
My blog is http://oldboy.blog.51cto.com
My qq is 49000448
my god,my name is not oldbey,but OLDBOY.
is blog not boog
not boog
4900000448,49448
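Because .* also matches the empty string, it selects every line, even blank ones. The sketch below illustrates this on made-up input and also shows how far .* stretches inside a line. It relies on grep -o, which prints only the matched text; that option is available in GNU and BSD grep but is not required by POSIX, so treat its presence as an assumption.

```shell
#!/bin/sh
# Three input lines; the middle one is empty.
input=$(printf 'blog\n\nboog\n')

any_seq=$(printf '%s\n' "$input" | grep -c '.*')   # .* matches the empty line too
one_char=$(printf '%s\n' "$input" | grep -c '.')   # . needs at least one character
echo "$any_seq $one_char"   # prints: 3 2

# .* takes the longest possible match: from the first b to the last g.
greedy=$(printf '%s\n' 'is blog not boog' | grep -o 'b.*g')
echo "$greedy"   # prints: blog not boog
```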
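7) [ ] is a bracket expression (character set). It matches any single character from the set listed inside the brackets.
Example: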
[root@weibochoutu_1 test]# cat oldboy.log
I am oldboy linux teacher.
I like chinese chess,table tennis.
My blog is http://oldboy.blog.51cto.com
My qq is 49000448
my god,my name is not oldbey,but OLDBOY.
is blog not boog
not boog
4900000448,49448
[root@weibochoutu_1 test]# grep "b[lo]og" oldboy.log
My blog is http://oldboy.blog.51cto.com
is blog not boog
not boog
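A bracket expression always stands for exactly one character, whether the members are listed one by one or given as a range. A short sketch on four made-up words; LC_ALL=C is set as a precaution so that the range a-z means plain ASCII lowercase letters.

```shell
#!/bin/sh
words=$(printf '%s\n' blog boog bxog bog)

# [lo] accepts l or o in the second position: blog, boog.
listed=$(printf '%s\n' "$words" | LC_ALL=C grep -c '^b[lo]og$')
# [a-z] accepts any lowercase letter there: blog, boog, bxog.
ranged=$(printf '%s\n' "$words" | LC_ALL=C grep -c '^b[a-z]og$')

echo "$listed $ranged"   # prints: 2 3
```

bog never matches because the bracket expression must consume one character of its own.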
8) [^ ] matches any single character that is not listed after the ^.
Example:
[^word] matches any single character other than w, o, r, or d.
[root@localhost ~]# grep "[^qq]" oldboy.log --color
I am oldboy linux teacher.
I like chinese chess,table tennis.
My blog is http://oldboy.blog.51cto.com
My qq is 49000448
my god,my name is not oldbey,but OLDBOY.
is blog not boog
not boog
4900000448,49448
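Every line is printed above because each one contains at least one character other than q; [^q] does not mean "lines without q". The sketch below contrasts the two ideas on three made-up lines, using grep -v (invert the match) for the second.

```shell
#!/bin/sh
input=$(printf '%s\n' 'My qq is 49000448' 'qq' 'not boog')

# Lines with at least one character that is not q: the first and the third.
has_non_q=$(printf '%s\n' "$input" | grep -c '[^q]')
# Lines that contain no q at all: only the third.
without_q=$(printf '%s\n' "$input" | grep -vc 'q')

echo "$has_non_q $without_q"   # prints: 2 1
```

Only the line qq fails [^q], since it has no other character to offer.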
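9) a\{n,m\} matches the preceding character repeated n to m times. (With egrep the backslashes can be dropped.)
Example: a\{n,m\} matches a repeated n to m times.
a\{,m\} matches the preceding character repeated at most m times. (This form is no longer supported on CentOS 6.)
\{n,\} matches the preceding character repeated at least n times. (With egrep the backslashes can be dropped.)
\{n\} matches the preceding character repeated exactly n times. (With egrep the backslashes can be dropped.)
Example:
Match 0 repeated 2 to 3 times: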
[root@localhost ~]# cat oldboy.log
I am oldboy linux teacher.
I like chinese chess,table tennis.
My blog is http://oldboy.blog.51cto.com
My qq is 49000448
my god,my name is not oldbey,but OLDBOY.
is blog not boog
not boog
4900000448,49448
[root@localhost ~]# grep "490\{2,3\}" oldboy.log --color
My qq is 49000448
4900000448,49448
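The second line has five zeros, yet it still matches, because the pattern is not anchored and only needs to fit somewhere inside the line. A quick way to see which part matched is grep -o (available in GNU and BSD grep, not required by POSIX), sketched here on that one line:

```shell
#!/bin/sh
line='4900000448,49448'

# Print only the text that the interval pattern actually matched.
matched=$(printf '%s\n' "$line" | grep -o '490\{2,3\}')
echo "$matched"   # prints: 49000
```

The match stops after three zeros; the remaining zeros are simply ignored because nothing in the pattern says what must follow.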
Example:
[root@localhost ~]# cat oldboy.log
I am oldboy linux teacher.
I like chinese chess,table tennis.
My blog is http://oldboy.blog.51cto.com
My qq is 49000448
my god,my name is not oldbey,but OLDBOY.
is blog not boog
not boog
4900000448,49448
[root@localhost ~]# grep "490\{2,\}" oldboy.log --color   # matches 0 repeated at least 2 times
My qq is 49000448
4900000448,49448
Example:
[root@localhost ~]# cat oldboy.log
I am oldboy linux teacher.
I like chinese chess,table tennis.
My blog is http://oldboy.blog.51cto.com
My qq is 49000448
my god,my name is not oldbey,but OLDBOY.
is blog not boog
not boog
4900000448,49448
[root@localhost ~]# grep "490\{2,\}448" oldboy.log --color   # matches 0 repeated at least 2 times, followed by 448
My qq is 49000448
4900000448,49448
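To make the upper bound of an interval count, the pattern needs something fixed on both sides. The sketch below anchors the pattern on made-up numbers and runs it in both syntaxes: BRE with backslashes and ERE via grep -E, which is the standard equivalent of egrep.

```shell
#!/bin/sh
# One number per line, with 0, 1, 2, 3, and 5 zeros between 49 and 448.
nums=$(printf '%s\n' 49448 490448 4900448 49000448 4900000448)

# BRE: braces are escaped. Two or three zeros only.
bre=$(printf '%s\n' "$nums" | grep '^490\{2,3\}448$')
# ERE: the same interval without backslashes.
ere=$(printf '%s\n' "$nums" | grep -E '^490{2,3}448$')
# Open-ended interval: two or more zeros.
at_least=$(printf '%s\n' "$nums" | grep -c '^490\{2,\}448$')

echo "$bre"        # prints 4900448 and 49000448, one per line
echo "$at_least"   # prints: 3
```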
Extended regular expressions (ERE) are rarely needed, so we skip them for now.
Where on Linux can you look up help for regular expressions?
[root@localhost ~]# man grep
A worked example
Extract the IP address:
Method 1: [root@localhost ~]# ifconfig eth1|grep "inet addr"|cut -d ":" -f2|cut -d " " -f1
192.168.1.106
Method 2:
[root@localhost ~]# ifconfig eth1|grep "inet addr"|awk -F ":" '{print $2}'|awk '{print $1}'
192.168.1.106
Method 3: [root@localhost ~]# ifconfig eth1|sed -n '2p'|awk -F ":" '{print $2}'|awk '{print $1}'
192.168.1.106
Method 4: [root@localhost ~]# ifconfig eth1|awk -F "[: ]+" 'NR==2 {print $4}'
192.168.1.106
Method 5 (using sed): [root@localhost ~]# ifconfig eth1|sed -n '/inet addr/p'|sed 's#^.*addr:##g'|sed 's# B.*##g'
192.168.1.106
Method 6 (a single sed command):
[root@localhost ~]# ifconfig eth1|sed -n 's#^.*addr:\(.*\) Bcast.*$#\1#gp'
192.168.1.106
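These pipelines depend on the exact text that ifconfig prints, so they are easiest to experiment with on a saved sample. The sketch below uses a two-line sample in the older "inet addr:" format; the MAC address, broadcast address, and netmask in it are made up for illustration, and newer ifconfig versions may print a different layout that these patterns would not fit. The sed variant captures only digits and dots, a slight change from Method 6 that avoids picking up trailing spaces.

```shell
#!/bin/sh
# Hypothetical sample of old-style ifconfig output (only the IP comes from the text above).
sample=$(cat <<'EOF'
eth1      Link encap:Ethernet  HWaddr 00:0C:29:AA:BB:CC
          inet addr:192.168.1.106  Bcast:192.168.1.255  Mask:255.255.255.0
EOF
)

# grep + cut: take the text after the first colon, then up to the first space.
ip_cut=$(printf '%s\n' "$sample" | grep "inet addr" | cut -d ":" -f2 | cut -d " " -f1)
# awk: split on runs of colons and spaces; on line 2 the fourth field is the IP.
ip_awk=$(printf '%s\n' "$sample" | awk -F '[: ]+' 'NR==2 {print $4}')
# sed: capture the digits and dots that follow "addr:" and print only that group.
ip_sed=$(printf '%s\n' "$sample" | sed -n 's#^.*addr:\([0-9.]*\).*$#\1#p')

echo "$ip_cut $ip_awk $ip_sed"   # prints: 192.168.1.106 192.168.1.106 192.168.1.106
```

In the awk version the leading spaces on line 2 produce an empty first field, which is why the address lands in $4 rather than $3.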
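A note on sed (regex) matching techniques:
I think there are many more approaches along these lines; plenty of hands-on practice is what really counts.
sed and awk are the key tools, so learn them well!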