Beautiful Soup中的find，find_all

时间：2017-11-20 21:51:54 阅读：124 评论：0 收藏：0 [点我收藏+]

标签：open ext ram class 分享图片 ges http 生产 beautiful

1.一般来说，为了找到BeautifulSoup对象内任何第一个标签入口，使用find()方法。

技术分享图片

以上代码是一个生态金字塔的简单展示，为了找到第一生产者，第一消费者或第二消费者，可以使用Beautiful Soup。

找到第一生产者：

生产者在第一个<url>标签里，因为生产者在整个html文档中第一个<url>标签中出现，所以可以使用find()方法找到第一生产者，在ecologicalpyramid.py

中写入下面一段代码，使用ecologicalpyramid.html文件创建BeautifulSoup对象。

from bs4 import BeautifulSoup
with open(‘ecologicalpyramid.html‘, ‘r‘) as ecological_pyramid:　　　　# ecological 生态系统  pyramid 金字塔
　　soup = BeautifulSoup(ecological_pyramid)
producer_entries = soup.find(‘ul‘)
print(producer_entries.li.div.string)

输出结果：plants

2.find()说明

find函数：

find(name, attrs, recursive, text, **wargs)　　　　# recursive 递归的，循环的

这些参数相当于过滤器一样可以进行筛选处理。不同的参数过滤可以应用到以下情况：

查找标签，基于name参数
查找文本，基于text参数
基于正则表达式的查找
查找标签的属性，基于attrs参数
基于函数的查找

通过标签查找：

可以传递任何标签的名字来查找到它第一次出现的地方。找到后，find函数返回一个BeautifulSoup的标签对象。

from bs4 import BeautifulSoup

with open(‘ecologicalpyramid.html‘, ‘r‘) as ecological_pyramid:
　　soup = BeautifulSoup(ecological_pyramid, ‘html‘)
producer_entries = soup.find(‘ul‘)
print(type(producer_entries))

输出的得到 <class ‘bs4.element.Tag‘>

通过文本查找：

直接字符串的话，查找的是标签。如果想要查找文本的话，则需要用到text参数。如下所示：

from bs4 import BeautifulSoup

with open(‘ecologicalpyramid.html‘, ‘r‘) as ecological_pyramid:
　　soup = BeautifulSoup(ecological_pyramid, ‘html‘)
producer_string = soup.find(text = ‘plants‘)
print(plants_string)

输出：plants

http://blog.csdn.net/abclixu123/article/details/38502993

Beautiful Soup中的find，find_all

标签：open ext ram class 分享图片 ges http 生产 beautiful

原文地址：http://www.cnblogs.com/keye/p/7868059.html

踩

(0)

评论一句话评论（0）

分享档案

更多>

2021年07月29日 (22)
2021年07月28日 (40)
2021年07月27日 (32)
2021年07月26日 (79)
2021年07月23日 (29)
2021年07月22日 (30)
2021年07月21日 (42)
2021年07月20日 (16)
2021年07月19日 (90)
2021年07月16日 (35)

周排行