Search keyword: urlopen. 699 results found. mamicode.com

A simple crawler search feature implemented in Python

from html.parser import HTMLParser
from urllib.request import urlopen
from urllib import parse
class LinkParser(HTMLParser):
    def handle_starttag(self, tag, attrs):
        if tag == ...:
            for (key, val...

Category: Programming Languages | Time: 2015-10-29 01:00:57 | Views: 533
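The excerpt above is cut off before the body of handle_starttag, so the following is only a minimal sketch of how such a parser might work. It assumes the goal is to collect the href values of anchor tags, which the truncated snippet does not confirm. The base_url parameter, the links attribute, and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect absolute link targets from the <a> tags of an HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # used to resolve relative links
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Assumption: only anchor tags are of interest.
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


# Hypothetical page content and base URL, used only for illustration.
sample_html = '<p><a href="/about">About</a> <a href="http://example.org/x">X</a></p>'
parser = LinkParser("http://example.com/index.html")
parser.feed(sample_html)
print(parser.links)
```

Feeding the parser a string keeps the sketch runnable offline. In a crawler, the string would come from urlopen(url).read() after decoding.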

Python crawler for beginners

# Crawl the images on a website
import re  # regular expression library
import urllib  # URL library

def getHtml(url):
    page = urllib.urlopen(url)  # open the URL
    html = page.read() ...

Category: Programming Languages | Time: 2015-10-16 15:10:12 | Views: 278
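The excerpt stops before the image-matching step, so here is a hedged sketch of what extracting image URLs with a regular expression could look like in Python 3. The pattern, the function name get_img_urls, and the sample HTML are assumptions, not the original author's code.

```python
import re

# Hypothetical pattern: src attributes ending in .jpg, .png or .gif.
IMG_PATTERN = re.compile(r'<img[^>]+src="([^"]+\.(?:jpg|png|gif))"', re.IGNORECASE)


def get_img_urls(html):
    """Return the image URLs found in an HTML string, in document order."""
    return IMG_PATTERN.findall(html)


# Hypothetical page content, used only for illustration.
sample_html = (
    '<div><img class="pic" src="http://example.com/a.jpg">'
    '<img src="http://example.com/b.png" alt="b"></div>'
)
print(get_img_urls(sample_html))
```

In Python 3 the result of read() is bytes, so it needs decoding before a str pattern like this one can be applied. Each URL found could then be saved with urllib.request.urlretrieve.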

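Small details when using Python regular expressions

Today I wrote a simple crawler in Python that fetches the home page of Hupu Basketball (nba.hupu.com). The code is as follows:
#coding:gb2312
import urllib2, re
webpage = urllib2.urlopen('http://nba.hupu.com')
text = webp...

Category: Programming Languages | Time: 2015-10-15 01:12:49 | Views: 201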

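Python crawler study notes (1)

1. Introduction to urllib2. urllib2 is a Python module for fetching URLs (Uniform Resource Locators). It offers a very simple interface in the form of the urlopen function, which can fetch URLs over a variety of protocols. It also offers a slightly more complex interface for handling common situations such as basic authentication, cookies, and proxies. 2. Fetching URLs. Using urli...

Category: Programming Languages | Time: 2015-10-13 22:28:04 | Views: 527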
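urllib2 exists only in Python 2. In Python 3 the same function is available as urllib.request.urlopen. The following is a minimal sketch of the simple interface described above, with basic error handling added. The helper name fetch and the two demo URLs are assumptions chosen so the example runs without network access.

```python
from urllib.error import URLError
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Return the response body as bytes, or None if the URL cannot be opened."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.read()
    except URLError as err:  # HTTPError is a subclass of URLError
        print("failed to open", url, "-", err.reason)
        return None


# data: URLs are handled locally, so this call needs no network access.
print(fetch("data:text/plain,hello"))
# An unregistered scheme raises URLError, which fetch() turns into None.
print(fetch("nosuchscheme://example"))
```

The same fetch function accepts http and https URLs. The data: URL is used here only because it exercises urlopen without contacting a server.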

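Python 3 crawlers: getting started, and regular expressions

Fetching a given page with Python. The code is as follows:
import urllib.request
url = "http://www.baidu.com"
data = urllib.request.urlopen(url).read()
#data = data.decode('UTF-8')
print(data)
url...

Category: Programming Languages | Time: 2015-10-09 00:33:23 | Views: 329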
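The commented-out decode call in the excerpt matters because read() returns bytes in Python 3. The sketch below shows one way to decode the body using the charset declared in the response headers, falling back to UTF-8. The helper name fetch_text and the sample page are assumptions, and a data: URL stands in for a real site so the example runs offline.

```python
from urllib.parse import quote
from urllib.request import urlopen


def fetch_text(url, default_charset="utf-8"):
    """Fetch a URL and decode the body using the charset the response declares."""
    with urlopen(url) as response:
        raw = response.read()  # bytes, printed as b'...' if left undecoded
        charset = response.headers.get_content_charset() or default_charset
    return raw.decode(charset, errors="replace")


# A data: URL stands in for a real page so the example runs offline.
page = "<title>百度一下</title>"
url = "data:text/html;charset=utf-8," + quote(page)
text = fetch_text(url)
print(type(text).__name__, text)
```

Pages that declare their encoding only in a meta tag, or not at all, would need extra handling that this sketch does not attempt.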

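Writing a Python crawler with urllib

In newer versions of Python (Python 3), urllib and urllib2 have been merged into a single urllib package.
(1) Simple page fetch:
import urllib.request
content = urllib.request.urlopen(req).read().decode("utf-8")
(2) Adding a header:
import urllib
req = u...

Category: Programming Languages | Time: 2015-10-03 14:20:07 | Views: 216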
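The header example in the excerpt is truncated, so this is only a sketch of one common way to attach a header in Python 3, using urllib.request.Request. The helper name build_request, the URL, and the User-Agent string are hypothetical. The request is built but not sent, so no network access is needed.

```python
from urllib.request import Request


def build_request(url, user_agent):
    """Build a Request carrying a custom User-Agent header, without sending it."""
    return Request(url, headers={"User-Agent": user_agent})


# Hypothetical URL and User-Agent string, used only for illustration.
req = build_request("http://www.example.com/", "Mozilla/5.0 (compatible; demo)")
print(req.get_header("User-agent"))  # Request stores header names capitalized
# urllib.request.urlopen(req).read().decode("utf-8") would then send the request.
```

Note that `import urllib` on its own does not reliably make `urllib.request` available. Importing the submodule explicitly avoids an AttributeError.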

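Extracting content from Haitou (海投网) with bs4 and storing it in a MongoDB database

Example: http://xyzp.haitou.cc/article/722427.html. First, download each page directly, using either os.system("wget " + str(url)) or urllib2.urlopen(url). This is simple, so I won't go into detail. Then comes the main part, information extraction: #!/usr/....

Category: Databases | Time: 2015-09-29 18:47:43 | Views: 190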

A question about printing Chinese text in Python

#encoding=gb2312
import urllib
import re
def getHtml(url):
    page = urllib.urlopen(url)
    html = page.read()
    return html
def getImg(html):
    reg = r...

Category: Programming Languages | Time: 2015-09-29 07:38:59 | Views: 334

Python proxy settings

import urllib2

def test3():
    url = "http://www.ip.cn"
    proxy_handler = urllib2.ProxyHandler({'http': 'http://username:password@host:port'})
    opener = urllib2.build_opener(proxy_handler)
    urllib2.install_opener(opener)
    conn = urllib2.urlopen(url)
    print(conn.read())

Category: Programming Languages | Time: 2015-09-23 19:37:36 | Views: 202
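The snippet above targets Python 2. A rough Python 3 equivalent, using the same ProxyHandler and build_opener calls from urllib.request, might look like the sketch below. The helper name make_proxy_opener and the proxy address are hypothetical, and nothing is sent over the network.

```python
import urllib.request


def make_proxy_opener(proxy_url):
    """Return an opener that routes http requests through proxy_url."""
    proxy_handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(proxy_handler)


# Hypothetical local proxy address; no request is sent in this sketch.
opener = make_proxy_opener("http://127.0.0.1:8080")
proxy_handlers = [
    h for h in opener.handlers if isinstance(h, urllib.request.ProxyHandler)
]
print(proxy_handlers[0].proxies)
# opener.open(url) uses the proxy for that call only, while
# urllib.request.install_opener(opener) makes plain urlopen() use it too.
```

Returning the opener instead of installing it globally keeps the proxy choice local to the caller, which is usually easier to test.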

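Python web crawler: a simple crawler example

Below we build a real crawler example that scrapes the list of recommended articles and their URLs from the front page of my cnblogs (博客园) home page.
scrape_home_articles.py
from urllib.request import urlopen
from bs4 import BeautifulSoup
import re
html = urlopen("h...

Category: Programming Languages | Time: 2015-09-23 13:12:05 | Views: 208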
