
A small Python example: fetching the Sina homepage

Reference

Liao Xuefeng's Python tutorial: http://www.liaoxuefeng.com/wiki/001374738125095c955c1e6d8bb493182103fac9270762a000/001386832653051fd44e44e4f9e4ed08f3e5a5ab550358d000

Code:

#!/usr/bin/python3

# import module
import socket

# create a TCP socket object
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# connect to Sina
s.connect(('www.sina.com.cn', 80))
# send the request (sockets carry bytes, so the request is a bytes literal)
s.sendall(b'GET / HTTP/1.1\r\nHost: www.sina.com.cn\r\nConnection: close\r\n\r\n')
# receive data
buffer = []
while True:
    # receive at most 1 KB each time; an empty result means the server closed the connection
    d = s.recv(1024)
    if d:
        buffer.append(d)
    else:
        break
data = b''.join(buffer)
# close socket
s.close()
# the first blank line separates the response header from the body
header, html = data.split(b'\r\n\r\n', 1)
print(header.decode('utf-8', errors='replace'))
# write the received body to a file
with open('sina.html', 'wb') as f:
    f.write(html)

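The script simulates a browser visiting a web server: it opens a TCP connection, sends an HTTP GET request, and reads back whatever the server returns.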
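To make the request and response format easier to see without touching the network, here is a minimal sketch in Python 3. The helper names `build_request` and `split_response` are hypothetical (they are not part of the script above), and the sample response is hand-written for illustration, not real output from www.sina.com.cn.

```python
def build_request(host, path="/"):
    """Return the raw bytes of a minimal HTTP/1.1 GET request (hypothetical helper)."""
    lines = [
        "GET {} HTTP/1.1".format(path),
        "Host: {}".format(host),
        "Connection: close",
        "",  # the two empty entries produce the blank line
        "",  # that terminates the request header
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw):
    """Split raw response bytes into (status_line, headers, body) (hypothetical helper)."""
    # Header and body are separated by the first blank line.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# Hand-written sample response, assumed for illustration only.
sample = (b"HTTP/1.1 200 OK\r\n"
          b"Content-Type: text/html\r\n"
          b"Content-Length: 13\r\n"
          b"\r\n"
          b"<html></html>")

status, headers, body = split_response(sample)
print(status)
print(headers["content-type"])
print(len(body))
```

With real data, the bytes from `build_request("www.sina.com.cn")` would be passed to `s.sendall(...)`, and the joined `recv` chunks would be passed to `split_response`.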
