
scrapy mongo pipeline

Date: 2021-02-19 13:41:33


import pymongo

db_configs = {
    'type': 'mongo',
    'host': '127.0.0.1',
    'port': '27017',
    'user': '',
    'password': '',
    'db_name': 'spider'
}


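class MongoPipeline:
    def __init__(self):
        # Read connection settings from the module-level db_configs dict.
        self.db_name = db_configs.get("db_name")
        self.host = db_configs.get("host")
        self.port = db_configs.get("port")
        self.username = db_configs.get("user")
        # The config key is "password"; looking up "passwd" always returned None.
        self.password = db_configs.get("password")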

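    def open_spider(self, spider):
        # Database.authenticate() was removed in PyMongo 4.0, so credentials
        # are passed to MongoClient and checked against the target database.
        auth = {}
        if self.username and self.password:
            auth = {'username': self.username, 'password': self.password, 'authSource': self.db_name}
        self.client = pymongo.MongoClient('mongodb://{}:{}'.format(self.host, self.port), connect=False, maxPoolSize=10, **auth)
        self.db = self.client[self.db_name]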

    def close_spider(self, spider):
        self.client.close()

    def process_item(self, item, spider):
        collection_name = spider.name
        # Upsert keyed on "url"; dict(item) gives PyMongo a plain mapping to encode.
        self.db[collection_name].update_one({"url": item["url"]}, {'$set': dict(item)}, upsert=True)
        return item
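
Scrapy only calls a pipeline that is registered in the project's ITEM_PIPELINES setting. The fragment below is a minimal sketch of that registration, assuming the class above lives in a module importable as myproject.pipelines; that path and the priority value 300 are illustrative assumptions, not taken from the original post.

```python
# settings.py (sketch): register the pipeline so Scrapy runs it for every item.
# "myproject.pipelines" is a hypothetical module path; use your project's own.
# The integer sets the run order; lower values run first (conventionally 0-1000).
ITEM_PIPELINES = {
    "myproject.pipelines.MongoPipeline": 300,
}
```

Because process_item uses spider.name as the collection name, each spider in the project writes to its own collection in the configured database.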


Original article: https://www.cnblogs.com/c-x-a/p/14411831.html
