Scraping Tencent Real Estate news with Python, using the libraries requests, re, time, and BeautifulSoup

2024-09-17 12:37:49, 218 views

import requests
import re
import time
from bs4 import BeautifulSoup

today = time.strftime('%Y-%m-%d', time.localtime(time.time()))    # today's date, e.g. 2024-09-17

one_url = 'http://hz.house.qq.com'    # base URL used to build each article's full link

url = 'http://hz.house.qq.com/zxlist/bdxw.htm'      # listing page to scrape
html = requests.get(url)
html.encoding = html.apparent_encoding
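# Each match is a tuple of (relative href, title, timestamp).
# Assumption: the href is a bare capture group. A "http://www.mamicode.com/" prefix appeared here,
# which looks like a link rewrite by the hosting site and would never match the Tencent page.
reg = re.compile(r'<a target="_blank" class="tit f-l f16 blue" href="(.*?)">(.*?)</a><span class="tm f-r gray">(.*?)</span>')
html_lis = re.findall(reg, html.text)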

for html_li in html_lis:
    new_url = one_url + html_li[0]
    new_time = html_li[2][0:10]             # the first 10 characters of the timestamp are the date (YYYY-MM-DD)
    # Print only articles dated today; the loop stops at the first entry with a different date
    if new_time == today:
        print(html_li[1], new_url)
        new_html = requests.get(new_url)
        soup = BeautifulSoup(new_html.text, 'html.parser')
        contents = soup.find_all('p', style="TEXT-INDENT: 2em")
        for content in contents:
            # .string is None when a paragraph contains nested tags; such paragraphs are skipped
            if content.string is not None:
                print(content.string)
        print('----------------------------Next article----------------------------')
    else:
        break
# Functions could be defined to reduce the repeated code
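
Following up on the closing comment, here is one possible way to split the listing logic into functions. It is only a sketch: the names `BASE_URL`, `LINK_RE`, `parse_listing`, and `todays_items` are hypothetical, the href pattern is assumed to be a plain capture group holding a relative path, and the listing is assumed to be ordered newest first. The functions take HTML text as input, so they can be tried without any network access.

```python
import re
import time
from urllib.parse import urljoin

BASE_URL = 'http://hz.house.qq.com'  # same base URL as one_url in the script

# Same pattern as in the script, with the href assumed to be a relative path.
LINK_RE = re.compile(
    r'<a target="_blank" class="tit f-l f16 blue" href="(.*?)">(.*?)</a>'
    r'<span class="tm f-r gray">(.*?)</span>'
)


def parse_listing(html_text, base_url=BASE_URL):
    """Return a list of (title, absolute_url, date) tuples found in the listing HTML."""
    items = []
    for href, title, stamp in LINK_RE.findall(html_text):
        # Keep only the date part (YYYY-MM-DD) of the timestamp.
        items.append((title, urljoin(base_url, href), stamp[:10]))
    return items


def todays_items(items, today=None):
    """Return the leading items dated `today`, stopping at the first other date."""
    if today is None:
        today = time.strftime('%Y-%m-%d')
    result = []
    for item in items:
        if item[2] != today:
            break  # mirrors the break in the original loop
        result.append(item)
    return result
```

With these in place, the main script could call `todays_items(parse_listing(html.text))` and then fetch and print each article's paragraphs as before.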
