start/
    scrapy.cfg
    start/
        __init__.py
        items.py
        pipelines.py
        settings.py
        spiders/
            __init__.py
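The role of each file is as follows:
scrapy.cfg: the project's configuration file
start/: the project's Python module, where your code goes
items.py: the project's item definitions
pipelines.py: the project's item pipelines
settings.py: the project's settings
spiders/: the directory that holds the spiders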
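    # Save this file under start/spiders/ (the file name is up to you).
    # Older Scrapy releases exposed the base class as scrapy.spider.BaseSpider;
    # current releases use scrapy.Spider.
    import scrapy

    class QiushiSpider(scrapy.Spider):
        name = "qiushi"
        allowed_domains = ["qiushibaike.com", "www.qiushibaike.com"]
        start_urls = ["http://www.qiushibaike.com/"]

        def parse(self, response):
            # Name the output file after the second-to-last URL segment
            filename = response.url.split("/")[-2]
            with open(filename, "wb") as f:  # write the raw response body
                f.write(response.body)

Return to the project's main directory and run:

    scrapy crawl qiushi

Original article: http://blog.csdn.net/luoyestudio/article/details/42177775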
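The parse method above names its output file with response.url.split("/")[-2]. The following is a minimal sketch of how that rule behaves on a few URLs, using only the standard library. The helper name filename_from_url and the /hot/ path are made up for illustration and are not part of the original post or of Scrapy. The fallback to the host name is also an added assumption, covering URLs where the split yields an empty string.

```python
from urllib.parse import urlsplit


def filename_from_url(url: str) -> str:
    """Hypothetical helper: apply the spider's split("/")[-2] rule to a URL."""
    parts = url.split("/")
    # Same rule as in parse(): take the second-to-last "/"-separated segment.
    candidate = parts[-2] if len(parts) >= 2 else ""
    if not candidate:
        # Assumption: fall back to the host name (or "index") when the
        # rule yields an empty string, e.g. for a URL without a trailing slash.
        candidate = urlsplit(url).netloc or "index"
    return candidate


if __name__ == "__main__":
    for url in (
        "http://www.qiushibaike.com/",
        "http://www.qiushibaike.com",
        "http://www.qiushibaike.com/hot/",  # made-up path
    ):
        print(url, "->", filename_from_url(url))
```

For the start URL in the spider, the rule yields www.qiushibaike.com, so that is the name of the file the crawl writes in the directory where scrapy crawl qiushi is run.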