scrapy | downloader middleware

Date: 2019-02-15 15:28:24


1. User-Agent

By default, Scrapy's built-in UserAgentMiddleware sets the header to "User-Agent": "Scrapy/1.5.1 (+https://scrapy.org)" (the version number depends on the installed Scrapy release).

I. Set USER_AGENT in the project's settings file

USER_AGENT = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.71 Safari/537.36"

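II. Write a custom middleware that picks a random User-Agent. Once it is written, enable it by uncommenting DOWNLOADER_MIDDLEWARES in the settings file.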

from random import choice


class RandomMiddlewares(object):
    def __init__(self):
        # Pool of User-Agent strings to choose from
        self.user_agent = [
            "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11",
            "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Chrome/10.0.648.133 Safari/534.16",
            "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36",
            "Mozilla/5.0 (compatible; Baiduspider/2.0; - +http://www.baidu.com/search/spider.html)",
            "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        ]

    def process_request(self, request, spider):
        # Overwrite the User-Agent header of each outgoing request with a random entry
        request.headers["User-Agent"] = choice(self.user_agent)
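
Step II says the middleware has to be enabled in the settings file once it is written. A minimal sketch of what that entry might look like follows. The package name `myproject` and the priority `543` are assumptions for illustration and do not come from the original post; disabling the built-in UserAgentMiddleware is optional and shown only as one possible choice.

```python
# settings.py (sketch; "myproject" is a hypothetical project package name)
DOWNLOADER_MIDDLEWARES = {
    # Custom middleware defined in middlewares.py; 543 is an example priority.
    "myproject.middlewares.RandomMiddlewares": 543,
    # Optional: setting a middleware to None disables it, so only the
    # random middleware sets the User-Agent header.
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
}
```

The dictionary key must be the full import path of the class, so it has to match wherever RandomMiddlewares actually lives in the project.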

 


Original post: https://www.cnblogs.com/404NooFound/p/10383604.html
