Parsing the Lagou homepage with bs4

Date: 2021-06-25 16:38:55

Tags: read urlopen http sele urllib analysis type href import

from urllib.request import urlopen
from bs4 import BeautifulSoup as BS

url = "http://www.lagou.com"
# (1) Get the response object
response = urlopen(url)
# (2) Read the page source from the response and decode it (UTF-8 by default)
html = response.read().decode()
# (3) Create the BeautifulSoup object
bs = BS(html, "html.parser")
# (4) Extract information
a_list = bs.select("a")
for i in a_list:
    print(i)
    # Each Tag supports the same select, find and find_all methods as the bs object,
    # so i can be searched further
    # print(i.select("font"))
    # print(type(i))
    # 1) i.get(key) returns the value of the attribute named key
    # print(i.get("href"))
    # 2) Get the text content enclosed by the tag
    print(i.text)
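
The extraction step can be tried without a network connection. The sketch below is only an illustration: SAMPLE_HTML is made-up markup standing in for the downloaded page, and extract_links is a hypothetical helper name, neither comes from the Lagou site. It shows get("href") returning None when the attribute is missing, and a Tag being searched with select and find_all just like the top-level object.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<div id="nav">
  <a href="/jobs/python"><font color="red">Python</font> jobs</a>
  <a href="/jobs/java">Java jobs</a>
  <a>No link here</a>
</div>
"""


def extract_links(html):
    """Return (href, text) pairs for every <a> tag in html."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for a in soup.select("a"):
        # get() returns None instead of raising when the attribute is absent.
        href = a.get("href")
        # .text concatenates all strings nested inside the tag.
        text = a.text.strip()
        pairs.append((href, text))
    return pairs


links = extract_links(SAMPLE_HTML)
for href, text in links:
    print(href, text)

# A Tag offers the same search methods as the BeautifulSoup object,
# so the first <a> can be searched for nested <font> tags.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
first_a = soup.find("a")
font_tags = first_a.select("font")
print(font_tags[0].text)
```

Checking the result of get("href") for None before using it avoids errors on anchors that carry no link.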


Original article: https://www.cnblogs.com/ayajing/p/14928296.html
