Search keyword: bs4. 922 results found. 码迷 (mamicode.com)

Writing a multithreaded crawler with requests, BeautifulSoup, and xml

# -*- coding:UTF-8 -*-
import requests, time
from collections import OrderedDict
import threading
from bs4 import BeautifulSoup as bp
t3 = time.time()
... ...

Category: Programming Languages | Time: 2017-06-02 17:23:48 | Views: 202
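The snippet above is truncated, so the author's actual crawler is not shown. As a minimal sketch of the general idea (a small pool of threads pulling URLs from a shared queue), the following uses only the standard library. The names `crawl`, `fake_fetch`, and `num_threads` are hypothetical, and `fake_fetch` stands in for a real network call so the example runs offline.

```python
import queue
import threading
from collections import OrderedDict


def crawl(urls, fetch, num_threads=4):
    """Fetch every URL with a small pool of threads; return results in input order."""
    tasks = queue.Queue()
    for url in urls:
        tasks.put(url)

    results = {}
    lock = threading.Lock()

    def worker():
        # Each worker keeps taking URLs until the queue is empty.
        while True:
            try:
                url = tasks.get_nowait()
            except queue.Empty:
                return
            try:
                page = fetch(url)
            except Exception as exc:  # keep the error instead of killing the thread
                page = exc
            with lock:
                results[url] = page

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Threads finish in arbitrary order, so rebuild the mapping in input order.
    return OrderedDict((url, results[url]) for url in urls)


def fake_fetch(url):
    """Hypothetical stand-in for a real download; returns a fake page body."""
    return "<html>" + url + "</html>"


pages = crawl(["http://example.com/1", "http://example.com/2"], fake_fetch)
```

In real use, `fetch` could be something like `lambda u: requests.get(u, timeout=10).text`, and each returned page could then be handed to BeautifulSoup for parsing.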

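Python crawler: scraping data from Sina News detail pages (function version)

The previous article, "Python crawler: scraping Sina News data", explained in detail how to scrape the relevant data from a Sina News detail page, but the code was structured in a way that is hard to extend: it had to be rewritten for every new detail page. We therefore need to refactor it into functions that can be called directly. Six pieces of data are scraped from the detail page: news title, comment count, time, source, body text, and editor in charge. First, we organize the comment count into ...

Category: Programming Languages | Time: 2017-06-02 13:28:54 | Views: 275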
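The article's own code is not included in this listing. As a rough sketch of what "wrapping the extraction in a function" might look like, the example below parses a made-up detail page into the six fields named above. The markup in `SAMPLE_HTML`, every CSS selector, and the names `text_of` and `parse_news_detail` are assumptions for illustration only; the real Sina page structure will differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a news detail page.
SAMPLE_HTML = """
<html><body>
<h1 id="title">Sample headline</h1>
<span class="time">2017-06-02 13:00</span>
<span class="source">Sample Agency</span>
<span class="comments">42</span>
<div id="article"><p>First paragraph.</p><p>Second paragraph.</p></div>
<p class="editor">Editor: Jane Doe</p>
</body></html>
"""


def text_of(soup, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    node = soup.select_one(selector)
    return node.get_text(strip=True) if node else None


def parse_news_detail(html):
    """Extract the six fields from one detail page (selectors are assumptions)."""
    soup = BeautifulSoup(html, "html.parser")
    paragraphs = [p.get_text(strip=True) for p in soup.select("#article p")]
    comments = text_of(soup, ".comments")
    return {
        "title": text_of(soup, "#title"),
        "comments": int(comments) if comments and comments.isdigit() else None,
        "time": text_of(soup, ".time"),
        "source": text_of(soup, ".source"),
        "body": "\n".join(paragraphs),
        "editor": text_of(soup, ".editor"),
    }


detail = parse_news_detail(SAMPLE_HTML)
```

Because all page-specific knowledge lives in one function, scraping another detail page only means calling `parse_news_detail` with that page's HTML.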

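Scraping the content of 1,000 listing pages from Xiaozhu Duanzu (小猪短租)

The code is as follows ...

Category: Other Good Reads | Time: 2017-05-31 22:11:58 | Views: 162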

Scraping "Romance of the Three Kingdoms" with BeautifulSoup4 in Python 3

#!/usr/bin/python
# coding=utf-8
import urllib.request
from bs4 import BeautifulSoup
url = "http://www.shicimingju.com/book/sanguoyanyi.html"  # URL of the page to scrape
menu ...

Category: Programming Languages | Time: 2017-05-29 22:53:49 | Views: 336

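The BeautifulSoup library

https://www.crummy.com/software/BeautifulSoup/bs4/doc.zh/#id4 (Chinese edition). What the BeautifulSoup library does: it extracts data from HTML and XML documents, and modifies, navigates, and searches the document. Creating html_doc: >>> html_doc = """... <html> ...

Category: Other Good Reads | Time: 2017-05-29 12:04:02 | Views: 212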
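To make the four operations named above concrete (extract, navigate, search, modify), here is a minimal sketch on a small made-up document. The `html_doc` content below is an assumption for illustration and is not the truncated `html_doc` from the linked documentation.

```python
from bs4 import BeautifulSoup

# Made-up document used only for this illustration.
html_doc = """
<html><head><title>Sample page</title></head>
<body>
<p class="intro">Hello, <b>soup</b>!</p>
<a href="http://example.com/a" id="link1">First</a>
<a href="http://example.com/b" id="link2">Second</a>
</body></html>
"""

soup = BeautifulSoup(html_doc, "html.parser")

# Navigate: walk the tree by tag name.
title = soup.title.string

# Search: find every <a> tag and collect its href attribute.
links = [a["href"] for a in soup.find_all("a")]

# Extract: get the text of a tag, including the text of nested tags.
intro_text = soup.find("p", class_="intro").get_text()

# Modify: change an attribute in place.
first_link = soup.find(id="link1")
first_link["href"] = "http://example.com/changed"
```

The second argument, `"html.parser"`, selects Python's built-in parser, so no extra parser package is needed for this sketch.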

Python: scraping all the novels on a given qidian page

import re
import urllib.request
from bs4 import BeautifulSoup
import time

url = input("Enter the URL of any page: ")

def gethtml(url):
    # Fetch the page source... ...

Category: Programming Languages | Time: 2017-05-25 13:26:13 | Views: 258

Python: scraping a qidian novel

import re
import urllib.request
from bs4 import BeautifulSoup
import time

url = input("URL of the first chapter: ")

def gethtml(url):
    # Fetch the page source
    htm... ...

Category: Programming Languages | Time: 2017-05-24 22:44:32 | Views: 254

Installing BeautifulSoup4

Fixing the "ImportError: cannot import name 'HTMLParseError'" error that bs4 raises under Python 3.5. Category: Python (4251) (3). After upgrading to Python 3.5, I ran into ... when using BeautifulSoup4 ...

Category: Other Good Reads | Time: 2017-05-20 11:09:29 | Views: 224
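The snippet is cut off before the author's fix. As background, Python 3.5 removed `HTMLParseError` from `html.parser`, and older beautifulsoup4 releases still tried to import it; upgrading beautifulsoup4 usually resolves the error. The following sketch only diagnoses the situation and changes nothing. The function names are hypothetical, and `importlib.metadata` requires Python 3.8 or later.

```python
import html.parser
from importlib import metadata


def stdlib_has_html_parse_error():
    """True only on old Pythons that still ship html.parser.HTMLParseError."""
    return hasattr(html.parser, "HTMLParseError")


def installed_bs4_version():
    """Return the installed beautifulsoup4 version string, or None if absent."""
    try:
        return metadata.version("beautifulsoup4")
    except metadata.PackageNotFoundError:
        return None


def diagnose():
    """Summarize whether the HTMLParseError import problem could apply here."""
    version = installed_bs4_version()
    if version is None:
        return "beautifulsoup4 is not installed; try: pip install beautifulsoup4"
    if stdlib_has_html_parse_error():
        return "beautifulsoup4 %s: this Python still provides HTMLParseError" % version
    return ("beautifulsoup4 %s: if importing bs4 fails with the HTMLParseError "
            "error, try: pip install --upgrade beautifulsoup4" % version)


print(diagnose())
```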
