
Python text-crawler demo: fetching the images on the Jikexueyuan (极客学院) home page

Date: 2016-07-15 11:10:04


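myReApp.py (images are saved to the pic directory)
import os
import re

# On Windows, install requests by running "pip install requests" in cmd
import requests

# Read the saved page source
with open("hello.txt", "rb") as f:
    # Decode as UTF-8; otherwise re.findall raises
    # TypeError: cannot use a string pattern on a bytes-like object
    html = f.read().decode("utf-8")

# Match the image URLs
pic_url = re.findall('img src="(.*?)" class="lessonimg"', html, re.S)

# Create the output directory if it does not exist yet
os.makedirs("pic", exist_ok=True)
for i, each in enumerate(pic_url):
    print("now downloading:" + each)
    pic = requests.get(each)
    # os.path.join keeps the path portable across operating systems
    with open(os.path.join("pic", str(i) + ".jpg"), "wb") as fp:
        fp.write(pic.content)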
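
The download loop above assumes every request succeeds and that every image is a JPEG. A slightly more defensive variant might look like the sketch below. The helper names `target_path` and `download_all`, the 10-second timeout, and the choice to skip failed URLs are all assumptions made for illustration and are not part of the original script.

```python
import os
from urllib.parse import urlparse


def target_path(index, url, out_dir="pic"):
    """Build a local path such as pic/0.jpg, keeping the URL's own extension."""
    # Fall back to .jpg when the URL path carries no extension.
    ext = os.path.splitext(urlparse(url).path)[1] or ".jpg"
    return os.path.join(out_dir, str(index) + ext)


def download_all(urls, out_dir="pic", timeout=10):
    """Download each URL into out_dir; report and skip any that fail."""
    import requests  # third-party: pip install requests

    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        try:
            response = requests.get(url, timeout=timeout)
            response.raise_for_status()  # treat HTTP 4xx/5xx as failures
        except requests.RequestException as exc:
            print("skipped " + url + ": " + str(exc))
            continue
        path = target_path(index, url, out_dir)
        with open(path, "wb") as fp:
            fp.write(response.content)
        saved.append(path)
    return saved
```

With this in place, the loop in myReApp.py could be replaced by a single call such as `download_all(pic_url)`, which returns the list of files that were actually written.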

hello.txt
<div class="one-classfiy-lesson lesson-list" style="display: block;">
    <ul class="cf">


<li id="2834" test="0" deg="0" stop="1">
	<div class="lessonimg-box">


		<a href="http://www.jikexueyuan.com/course/2834.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2834&posColumn=2834.1">
			<img src="http://a1.jikexueyuan.com/home/201606/28/6efc/577219a265490.jpg" class="lessonimg" title="Axure RP8.0 与 Axure RP7.0 的区别" alt="Axure RP8.0 与 Axure RP7.0 的区别">

			<div class="lessonplay" style="opacity: 0;">
				<i class="playericon" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2834&posColumn=2834.1"></i>
			</div>


		</a>

	</div>

	<div class="lesson-infor lesson-hover" style="height: 88px;">
		<h2 class="lesson-info-h2"><a href="http://www.jikexueyuan.com/course/2834.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2834&posColumn=2834.1">Axure RP8.0 与 Axure RP7.0 的区别</a></h2>
		<p style="height: 0px; opacity: 0; display: none;">
			本课程讲解 Axure RP8.0 与 Axure RP7.0 的区别,通过对比  Axure RP8 和 Axure RP7.0 原型设计工具,分析软件界面、菜单栏区域、工具栏区域、站点地图区域、部件区域、部件交互区域、部件管理区域、用例编辑以及发布设置、浏览器显示等等方面区别,学会 Axure RP8.0 原型设计工具的使用。
		</p>
		<div class="timeandicon">
			<div class="cf">
				<dl>
					<dd class="mar-b8"><i class="time-icon"></i><em>5课时
							71分钟</em>
					</dd>
					<dd class="zhongji" style="display: block;">

						<i class="xinhao-icon"></i><em>初级</em>
					</dd>

				</dl>
				<em class="learn-number" style="display: block;">2848人学习</em>
			</div>
			<div class="cf">
				<div class="lessonicon-box" style="bottom: -2px;">

					<a href="http://www.jikexueyuan.com/course/axure/">
						<img width="16" src="http://wirrorcdn.jikexueyuan.com/category/pm.png" alt="axure" title="Axure" jktag="&posGP=101001&posArea=0002&posOper=1004&aCId=2834&posColumn=2834.1&aCCate=axure">
					</a>

				</div>
			</div>
		</div>
	</div>
</li>




<li id="2617" test="0" deg="0">
	<div class="lessonimg-box">


		<a href="http://www.jikexueyuan.com/course/2617.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2617&posColumn=2617.2">
			<img src="http://a1.jikexueyuan.com/home/201603/21/aee5/56ef575e4745e.jpg" class="lessonimg" title="微信公众平台开发实战:微信绑定功能" alt="微信公众平台开发实战:微信绑定功能">

			<div class="lessonplay" style="opacity: 0;">
				<i class="playericon" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2617&posColumn=2617.2"></i>
			</div>


		</a>

	</div>

	<div class="lesson-infor" style="height: 88px;">
		<h2 class="lesson-info-h2"><a href="http://www.jikexueyuan.com/course/2617.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2617&posColumn=2617.2">微信公众平台开发实战:微信绑定功能</a></h2>
		<p style="height: 0px; opacity: 0; display: none;">
			本课程学习微信公众平台绑定功能,分别介绍并实例讲解网站老用户的绑定以及手机号码绑定两种类型,同时也更加深入学习对MySQL数据库的操作。
		</p>
		<div class="timeandicon">
			<div class="cf">
				<dl>
					<dd class="mar-b8"><i class="time-icon"></i><em>3课时
							90分钟</em>
					</dd>
					<dd class="zhongji">

					<i class="xinhao-icon3"></i><em>高级</em>
					</dd>

				</dl>
				<em class="learn-number">7809人学习</em>
			</div>
			<div class="cf">
				<div class="lessonicon-box">

					<a href="http://www.jikexueyuan.com/course/phpbase/">
						<img width="16" src="http://a1.jikexueyuan.com/home/201604/28/538c/5721889c1d714.png" alt="phpbase" title="PHP 基础" jktag="&posGP=101001&posArea=0002&posOper=1004&aCId=2617&posColumn=2617.1&aCCate=phpbase">
					</a>

				</div>
			</div>
		</div>
	</div>
</li>




<li id="2614" test="0" deg="0">
	<div class="lessonimg-box">


		<a href="http://www.jikexueyuan.com/course/2614.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2614&posColumn=2614.3">
			<img src="http://a1.jikexueyuan.com/home/201603/18/43bb/56eb5e43e9b52.jpg" class="lessonimg" title="绘制专业的产品原型应用示例:猿题库 App 产品原型设计" alt="绘制专业的产品原型应用示例:猿题库 App 产品原型设计">

			<div class="lessonplay" style="opacity: 0;">
				<i class="playericon" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2614&posColumn=2614.3"></i>
			</div>


		</a>

	</div>

	<div class="lesson-infor" style="height: 88px;">
		<h2 class="lesson-info-h2"><a href="http://www.jikexueyuan.com/course/2614.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2614&posColumn=2614.3">绘制专业的产品原型应用示例:猿题库 App 产品原型设计</a></h2>
		<p style="height: 0px; opacity: 0; display: none;">
			本课程通过猿题库产品原型设计,来讲解如何绘制专业的产品原型设计,如何绘制好专业的产品原型,如何写好产品原型的交互说明,介绍一些绘制原型的注意事项。
		</p>
		<div class="timeandicon">
			<div class="cf">
				<dl>
					<dd class="mar-b8"><i class="time-icon"></i><em>5课时
							121分钟</em>
					</dd>
					<dd class="zhongji">

					<i class="xinhao-icon3"></i><em>高级</em>
					</dd>

				</dl>
				<em class="learn-number">4638人学习</em>
			</div>
			<div class="cf">
				<div class="lessonicon-box">

					<a href="http://www.jikexueyuan.com/course/axure/">
						<img width="16" src="http://wirrorcdn.jikexueyuan.com/category/pm.png" alt="axure" title="Axure" jktag="&posGP=101001&posArea=0002&posOper=1004&aCId=2614&posColumn=2614.1&aCCate=axure">
					</a>

				</div>
			</div>
		</div>
	</div>
</li>




<li id="2612" test="0" deg="0">
	<div class="lessonimg-box">


		<a href="http://www.jikexueyuan.com/course/2612.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2612&posColumn=2612.4">
			<img src="http://a1.jikexueyuan.com/home/201603/18/3129/56eb5d0f33347.jpg" class="lessonimg" title="Meteor RESTful 详解" alt="Meteor RESTful 详解">

			<div class="lessonplay" style="opacity: 0;">
				<i class="playericon" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2612&posColumn=2612.4"></i>
			</div>


		</a>

	</div>

	<div class="lesson-infor" style="height: 88px;">
		<h2 class="lesson-info-h2"><a href="http://www.jikexueyuan.com/course/2612.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2612&posColumn=2612.4">Meteor RESTful 详解</a></h2>
		<p style="height: 0px; opacity: 0; display: none;">
			本课程介绍了 RESTful 的相关概念和其在 Meteor 中的应用,演示了如何在 Meteor 中调用外部 REST 服务,如何使用 Meteor 创建 REST 服务接口。最后介绍了 Meteor 特有的 REST WebSocket,也就是 DDP (分布式数据协议)。
		</p>
		<div class="timeandicon">
			<div class="cf">
				<dl>
					<dd class="mar-b8"><i class="time-icon"></i><em>4课时
							37分钟</em>
					</dd>
					<dd class="zhongji">

						<i class="xinhao-icon"></i><em>初级</em>
					</dd>

				</dl>
				<em class="learn-number">2978人学习</em>
			</div>
			<div class="cf">
				<div class="lessonicon-box">

					<a href="http://www.jikexueyuan.com/course/meteor/">
						<img width="16" src="http://wirrorcdn.jikexueyuan.com/category/nodedotjs.png" alt="meteor" title="Meteor" jktag="&posGP=101001&posArea=0002&posOper=1004&aCId=2612&posColumn=2612.1&aCCate=meteor">
					</a>

				</div>
			</div>
		</div>
	</div>
</li>




<li id="2779" test="0" deg="0" stop="1">
	<div class="lessonimg-box">


		<i class="free-icon"></i>
		<a href="http://www.jikexueyuan.com/course/2779.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2779&posColumn=2779.5">
			<img src="http://a1.jikexueyuan.com/home/201605/23/fa97/5742b71557d06.jpg" class="lessonimg" title="Google  I/O 2016 技术揭秘与前瞻" alt="Google  I/O 2016 技术揭秘与前瞻">

			<div class="lessonplay" style="opacity: 0;">
				<i class="playericon" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2779&posColumn=2779.5"></i>
			</div>


		</a>

	</div>

	<div class="lesson-infor" style="height: 88px;">
		<h2 class="lesson-info-h2"><a href="http://www.jikexueyuan.com/course/2779.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2779&posColumn=2779.5">Google  I/O 2016 技术揭秘与前瞻</a></h2>
		<p style="height: 0px; opacity: 0; display: none;">
			本课程主要针对2016 Google I/O大会Keynote发布的内容进行剖析。讲解Google目前发展的技术趋势及主打方向。并针对开发者在大会中所需要了解的知识方向进行分析。让开发者们了解接下去需要学习哪些技能来应对新发布的产品及技术。以及如何进行学习。
		</p>
		<div class="timeandicon">
			<div class="cf">
				<dl>
					<dd class="mar-b8"><i class="time-icon"></i><em>4课时
							52分钟</em>
					</dd>
					<dd class="zhongji" style="display: none;">

						<i class="xinhao-icon"></i><em>初级</em>
					</dd>

				</dl>
				<em class="learn-number" style="display: none;">8981人学习</em>
			</div>
			<div class="cf">
				<div class="lessonicon-box" style="bottom: 4px;">

					<a href="http://www.jikexueyuan.com/course/android/">
						<img width="16" src="http://a1.jikexueyuan.com/home/201411/03/00a1/54578d8e652c3.png" alt="android" title="Android" jktag="&posGP=101001&posArea=0002&posOper=1004&aCId=2779&posColumn=2779.1&aCCate=android">
					</a>

				</div>
			</div>
		</div>
	</div>
</li>




<li id="2707" test="0" deg="0" stop="1">
	<div class="lessonimg-box">


		<i class="free-icon"></i>
		<a href="http://www.jikexueyuan.com/course/2707.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2707&posColumn=2707.6">
			<img src="http://a1.jikexueyuan.com/home/201604/26/0ff6/571f242278c8d.jpg" class="lessonimg" title="整站项目开发实战之网站首页布局搭建" alt="整站项目开发实战之网站首页布局搭建">

			<div class="lessonplay" style="opacity: 0;">
				<i class="playericon" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2707&posColumn=2707.6"></i>
			</div>


		</a>

	</div>

	<div class="lesson-infor" style="height: 88px;">
		<h2 class="lesson-info-h2"><a href="http://www.jikexueyuan.com/course/2707.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2707&posColumn=2707.6">整站项目开发实战之网站首页布局搭建</a></h2>
		<p style="height: 0px; opacity: 0; display: none;">
			设计本课程的目的是让 Web 前端工程师,可以轻松将设计师提供的 PSD 效果图,切分成 HTML 网页文件,学完本课程可以掌握 PS 软件的切图技巧和 HTML 布局技巧。
		</p>
		<div class="timeandicon">
			<div class="cf">
				<dl>
					<dd class="mar-b8"><i class="time-icon"></i><em>5课时
							85分钟</em>
					</dd>
					<dd class="zhongji" style="display: none;">

					<i class="xinhao-icon2"></i><em>中级</em>
					</dd>

				</dl>
				<em class="learn-number" style="display: none;">13884人学习</em>
			</div>
			<div class="cf">
				<div class="lessonicon-box" style="bottom: 4px;">

					<a href="http://www.jikexueyuan.com/course/html/">
						<img width="16" src="http://wirrorcdn.jikexueyuan.com/category/webbase.png" alt="html" title="HTML" jktag="&posGP=101001&posArea=0002&posOper=1004&aCId=2707&posColumn=2707.1&aCCate=html">
					</a>

				</div>
			</div>
		</div>
	</div>
</li>




<li id="2706" test="0" deg="0">
	<div class="lessonimg-box">


		<a href="http://www.jikexueyuan.com/course/2706.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2706&posColumn=2706.7">
			<img src="http://a1.jikexueyuan.com/home/201604/26/9759/571ecfcc2dbfa.jpg" class="lessonimg" title="Android Studio 全方位指南之进阶使用技巧" alt="Android Studio 全方位指南之进阶使用技巧">

			<div class="lessonplay" style="opacity: 0;">
				<i class="playericon" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2706&posColumn=2706.7"></i>
			</div>


		</a>

	</div>

	<div class="lesson-infor" style="height: 88px;">
		<h2 class="lesson-info-h2"><a href="http://www.jikexueyuan.com/course/2706.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2706&posColumn=2706.7">Android Studio 全方位指南之进阶使用技巧</a></h2>
		<p style="height: 0px; opacity: 0; display: none;">
			本课程主要介绍在使用 Android Studio 进行开发时会使用到的一些高级技巧,Gradle 的高级配置方案,多渠道打包,以及一些插件利器的安装使用。
		</p>
		<div class="timeandicon">
			<div class="cf">
				<dl>
					<dd class="mar-b8"><i class="time-icon"></i><em>4课时
							53分钟</em>
					</dd>
					<dd class="zhongji">

						<i class="xinhao-icon"></i><em>初级</em>
					</dd>

				</dl>
				<em class="learn-number">8257人学习</em>
			</div>
			<div class="cf">
				<div class="lessonicon-box">

					<a href="http://www.jikexueyuan.com/course/android/">
						<img width="16" src="http://a1.jikexueyuan.com/home/201411/03/00a1/54578d8e652c3.png" alt="android" title="Android" jktag="&posGP=101001&posArea=0002&posOper=1004&aCId=2706&posColumn=2706.1&aCCate=android">
					</a>

				</div>
			</div>
		</div>
	</div>
</li>




<li id="2704" test="0" deg="0">
	<div class="lessonimg-box">


		<a href="http://www.jikexueyuan.com/course/2704.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2704&posColumn=2704.8">
			<img src="http://a1.jikexueyuan.com/home/201604/25/12c3/571d7a1ed4ea4.jpg" class="lessonimg" title="微网站制作速成法" alt="微网站制作速成法">

			<div class="lessonplay" style="opacity: 0;">
				<i class="playericon" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2704&posColumn=2704.8"></i>
			</div>


		</a>

	</div>

	<div class="lesson-infor" style="height: 88px;">
		<h2 class="lesson-info-h2"><a href="http://www.jikexueyuan.com/course/2704.html" target="_blank" jktag="&posGP=103001&posArea=0002&posOper=1003&aCId=2704&posColumn=2704.8">微网站制作速成法</a></h2>
		<p style="height: 0px; opacity: 0; display: none;">
			本课程介绍微网站产生的背景,主要用途与制作实例,重点讲解在没有美工的情况下,微网站该如何制作。学完本课程,你将可以独自完成一个微网站的项目。
		</p>
		<div class="timeandicon">
			<div class="cf">
				<dl>
					<dd class="mar-b8"><i class="time-icon"></i><em>3课时
							58分钟</em>
					</dd>
					<dd class="zhongji">

					<i class="xinhao-icon2"></i><em>中级</em>
					</dd>

				</dl>
				<em class="learn-number">8471人学习</em>
			</div>
			<div class="cf">
				<div class="lessonicon-box">

					<a href="http://www.jikexueyuan.com/course/phpbase/">
						<img width="16" src="http://a1.jikexueyuan.com/home/201604/28/538c/5721889c1d714.png" alt="phpbase" title="PHP 基础" jktag="&posGP=101001&posArea=0002&posOper=1004&aCId=2704&posColumn=2704.1&aCCate=phpbase">
					</a>

				</div>
			</div>
		</div>
	</div>
</li>



    </ul>
</div>
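
The regular expression in myReApp.py depends on `class="lessonimg"` appearing immediately after the `src` attribute, which holds for the sample above but would break if the attribute order changed. As an alternative, the same URLs can be collected with the standard library's `html.parser`. This is a minimal sketch, not the original author's method; the class name `LessonImageParser`, the function `extract_lesson_images`, and the shortened `sample` string are introduced here for illustration only.

```python
from html.parser import HTMLParser


class LessonImageParser(HTMLParser):
    """Collect the src of every <img> whose class list contains 'lessonimg'."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # An attribute without a value is reported as None, hence the "or".
        classes = (attributes.get("class") or "").split()
        if "lessonimg" in classes and attributes.get("src"):
            self.urls.append(attributes["src"])


def extract_lesson_images(html):
    """Return the lesson image URLs found in an HTML string, in page order."""
    parser = LessonImageParser()
    parser.feed(html)
    parser.close()
    return parser.urls


# A shortened stand-in for hello.txt: one lesson image and one category icon.
sample = '''
<a href="http://www.jikexueyuan.com/course/2834.html">
  <img src="http://a1.jikexueyuan.com/home/201606/28/6efc/577219a265490.jpg" class="lessonimg" title="Axure">
</a>
<a href="http://www.jikexueyuan.com/course/axure/">
  <img width="16" src="http://wirrorcdn.jikexueyuan.com/category/pm.png" alt="axure">
</a>
'''
print(extract_lesson_images(sample))
```

Only the first image is returned, because the category icon has no `lessonimg` class. The regular expression skips the icon for a different reason: its tag begins with `img width=` rather than `img src=`.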





Original article: http://blog.csdn.net/y545544032/article/details/51914948
