
A Python class for managing deployed Scrapy crawler projects

Date: 2020-07-05 19:07:10


# Tested against the browser pop-up login (HTTP Basic Auth):
import requests
from urllib import parse
import logging

logging.basicConfig(level=logging.INFO)


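class ScrapyManager(object):
    def __init__(self, url, project_name, spider=None, username=None, pwd=None):
        self.url = url
        self.project_name = project_name
        self.spider = spider
        # Send HTTP Basic Auth credentials only when a username is given;
        # a (None, None) tuple would otherwise be sent as literal "None:None".
        self.auth = (username, pwd) if username else None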

    def start_project(self):
        """
        Schedule a run of the configured spider (schedule.json).
        :return: None; the response body is logged
        """
        if not self.spider:
            raise ValueError('No spider name provided!')
        data = dict(
            project=self.project_name,
            spider=self.spider,
        )
        start_url = parse.urljoin(self.url, 'schedule.json')
        res = requests.post(url=start_url, data=data, auth=self.auth)
        logging.info(res.text)

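    def del_project(self):
        """
        Delete the project and all its uploaded versions (delproject.json).
        :return: None; the response body is logged
        """
        # delproject.json only takes the project name, so no spider is sent.
        data = dict(
            project=self.project_name,
        )
        del_url = parse.urljoin(self.url, 'delproject.json')
        res = requests.post(url=del_url, data=data, auth=self.auth)
        logging.info(res.text)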

    def stop_job(self, job_id):
        """
        Cancel a job (cancel.json).
        :param job_id: id of the job to cancel
        :return: None; the response body is logged
        """
        data = dict(
            project=self.project_name,
            job=job_id,
        )
        cancel_url = parse.urljoin(self.url, 'cancel.json')
        res = requests.post(url=cancel_url, data=data, auth=self.auth)
        logging.info(res.text)
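
One detail of the class is worth checking before use: `parse.urljoin` replaces the last path segment of the base URL unless that base ends with a slash. The sketch below shows the effect and one possible guard. The `endpoint` helper, the host `localhost:6800`, and the `/scrapyd` path prefix (for example, a server behind a reverse proxy) are assumptions made for illustration and are not part of the original class.

```python
from urllib import parse


def endpoint(base_url, name):
    """Join a base URL and an endpoint file name (hypothetical helper).

    A trailing slash is appended when missing so that urljoin keeps the
    whole base path instead of replacing its last segment.
    """
    if not base_url.endswith('/'):
        base_url += '/'
    return parse.urljoin(base_url, name)


# Plain urljoin drops the "scrapyd" segment because the base has no trailing slash.
plain = parse.urljoin('http://localhost:6800/scrapyd', 'schedule.json')
# The guarded version keeps the path prefix.
guarded = endpoint('http://localhost:6800/scrapyd', 'schedule.json')

print(plain)    # http://localhost:6800/schedule.json
print(guarded)  # http://localhost:6800/scrapyd/schedule.json
```

When the base URL has no path at all (for example `http://localhost:6800`), both forms give the same result, so the guard only matters for servers mounted under a sub-path.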

Some of the API endpoints have not been added yet; they can be added by following the official documentation.

https://scrapyd.readthedocs.io/en/latest/api.html
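
As a starting point for the missing endpoints, here is a minimal sketch covering a few of the read-only calls listed in that documentation (`daemonstatus.json`, `listprojects.json`, `listspiders.json`, `listjobs.json`), which are sent as GET requests. The names `GET_ENDPOINTS`, `endpoint_url` and `query` are hypothetical helpers, not part of the original class, and the base URL in the usage note is only an example.

```python
from urllib import parse

# Read-only Scrapyd endpoints that are queried with GET.
GET_ENDPOINTS = {
    'daemonstatus': 'daemonstatus.json',
    'listprojects': 'listprojects.json',
    'listspiders': 'listspiders.json',  # expects a "project" parameter
    'listjobs': 'listjobs.json',        # expects a "project" parameter
}


def endpoint_url(base_url, name):
    """Build the full URL for one of the endpoints above (hypothetical helper)."""
    return parse.urljoin(base_url, GET_ENDPOINTS[name])


def query(base_url, name, auth=None, **params):
    """Send the GET request and return the decoded JSON body (hypothetical helper)."""
    # requests is third-party; importing it here keeps endpoint_url()
    # usable even where requests is not installed.
    import requests
    res = requests.get(endpoint_url(base_url, name), params=params, auth=auth)
    res.raise_for_status()
    return res.json()
```

Against a running server, a call such as `query('http://localhost:6800/', 'listjobs', project='myproject')` would return the parsed JSON response; the same logic could equally be folded into `ScrapyManager` as extra methods that reuse `self.url` and `self.auth`.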


Original article: https://www.cnblogs.com/qianxunman/p/13247149.html
