Perl HTML::LinkExtor module (1)


2024-09-08 05:49:59, 212 views

use strict;
use warnings;
use LWP::Simple;
use HTML::LinkExtor;

# Fetch the page; get() returns undef on failure
my $html = get("http://www.baidu.com");
die "Could not fetch the page\n" unless defined $html;

# check() is called once for every tag that carries link attributes
my $link = HTML::LinkExtor->new(\&check);
$link->parse($html);
$link->eof;    # signal end of document so buffered text is flushed

sub check {
    my ($tag, %links) = @_;
    print "$tag\n";
    foreach my $key (keys %links) {
        print "$key -> $links{$key}\n";
    }
}

# $tag is the tag type, e.g. a, link, img, script
# %links is a hash: each key is a link attribute name, each value is the link itself
# For an a tag, for example, the key is href and the value is the URL in that href
# Output:
# link
# href -> /favicon.ico
# link
# href -> /content-search.xml
# link
# href -> //www.baidu.com/img/baidu.svg
# link
# href -> //s1.bdstatic.com
# link
# href -> //t1.baidu.com
# link
# href -> //t2.baidu.com
# link
# href -> //t3.baidu.com
# link
# href -> //t10.baidu.com
# link
# href -> //t11.baidu.com
# link
# href -> //t12.baidu.com
# link
# href -> //b1.bdstatic.com
# img
# src -> //www.baidu.com/img/bd_logo1.png

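This code prints the name of every tag on the page that carries a link, together with the corresponding link attributes and URLs.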
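Several of the URLs in the output above are relative (/favicon.ico) or protocol-relative (//s1.bdstatic.com). The sketch below shows one way to get absolute URLs instead, using the optional second argument of HTML::LinkExtor->new, a base URL. It is only an illustration: the HTML snippet is a hand-written stand-in for the fetched page, the names @found and $base are made up for this example, and the URI module is assumed to be installed.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Hand-written stand-in for the downloaded page (assumption, not the real page)
my $html = <<'HTML';
<link href="/favicon.ico">
<link href="//s1.bdstatic.com">
<img src="//www.baidu.com/img/bd_logo1.png">
HTML

my @found;
my $base = "http://www.baidu.com/";

# With a base URL, each link value arrives as a URI object made absolute
my $p = HTML::LinkExtor->new(
    sub {
        my ($tag, %links) = @_;
        for my $attr (sort keys %links) {
            # Interpolating the URI object turns it into a plain string
            push @found, "$tag $attr -> $links{$attr}";
        }
    },
    $base,
);
$p->parse($html);
$p->eof;

print "$_\n" for @found;
```

Without the base URL the callback receives the attribute values exactly as they appear in the HTML, which is what the original listing prints.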

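What if we only want to print the img addresses? We can check $tag to see which kind of tag we are looking at, and then extract just the data we need.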
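A minimal sketch of that idea follows. It is not necessarily how the follow-up article does it; the HTML snippet and the @images array are assumptions made for illustration, so that the example runs without network access.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Hand-written sample HTML (assumption) mixing several kinds of link tags
my $html = <<'HTML';
<a href="/news">News</a>
<img src="//www.baidu.com/img/bd_logo1.png">
<link href="/favicon.ico">
<img src="/img/banner.png" alt="banner">
HTML

my @images;

my $p = HTML::LinkExtor->new(
    sub {
        my ($tag, %links) = @_;
        # Tag names are reported in lower case; skip everything except img
        return unless $tag eq 'img';
        push @images, $links{src} if defined $links{src};
    }
);
$p->parse($html);
$p->eof;

print "$_\n" for @images;
```

The same test on $tag works for any other tag type, for example 'a' with the href key.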

For details, see: Perl HTML::LinkExtor module (2)
