
Intermediate Python ----> Storing JSON data with pymongo

Date: 2017-11-21 23:57:08


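  This post introduces pymongo, the third-party library for working with MongoDB from Python, and uses the requests library to build a simple crawler. The warmth and coldness of human relations are like flowers blooming and fading; better to think of it as an inevitable season.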

 

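Installing pymongo and preparation

1. Installing and starting MongoDB

Test machine: Windows 10, MongoDB v3.4.0, Python 3.6.3.

MongoDB is installed in D:\Database\DataBase\Mongo, and the data is stored in E:\data\database\mango\data. First start the MongoDB server; the log shows that it started successfully on local port 27017.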

D:\Database\DataBase\Mongo\Server\3.4\bin>mongod --dbpath E:\data\database\mango\data
2017-11-21T20:48:38.458+0800 I CONTROL  [initandlisten] MongoDB starting : pid=20484 port=27017 dbpath=E:\data\database\mango\data 64-bit host=Linux
2017-11-21T20:48:38.461+0800 I CONTROL  [initandlisten] targetMinOS: Windows 7/Windows Server 2008 R2
2017-11-21T20:48:38.462+0800 I CONTROL  [initandlisten] db version v3.4.0
2017-11-21T20:48:38.463+0800 I CONTROL  [initandlisten] git version: f4240c60f005be757399042dc12f6addbc3170c1
2017-11-21T20:48:38.464+0800 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1t-fips  3 May 2016
2017-11-21T20:48:38.465+0800 I CONTROL  [initandlisten] allocator: tcmalloc
2017-11-21T20:48:38.466+0800 I CONTROL  [initandlisten] modules: none
2017-11-21T20:48:38.466+0800 I CONTROL  [initandlisten] build environment:
2017-11-21T20:48:38.467+0800 I CONTROL  [initandlisten]     distmod: 2008plus-ssl
2017-11-21T20:48:38.468+0800 I CONTROL  [initandlisten]     distarch: x86_64
2017-11-21T20:48:38.469+0800 I CONTROL  [initandlisten]     target_arch: x86_64
2017-11-21T20:48:38.469+0800 I CONTROL  [initandlisten] options: { storage: { dbPath: "E:\data\database\mango\data" } }
2017-11-21T20:48:38.491+0800 I -        [initandlisten] Detected data files in E:\data\database\mango\data created by the wiredTiger storage engine, so setting the active storage engine to wiredTiger.
2017-11-21T20:48:38.493+0800 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=5573M,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),
2017-11-21T20:48:39.931+0800 I CONTROL  [initandlisten]
2017-11-21T20:48:39.933+0800 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2017-11-21T20:48:39.936+0800 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2017-11-21T20:48:39.940+0800 I CONTROL  [initandlisten]
2017-11-21T20:48:41.253+0800 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory E:/data/database/mango/data/diagnostic.data
2017-11-21T20:48:41.259+0800 I NETWORK  [thread1] waiting for connections on port 27017

Starting the MongoDB client: D:\Database\DataBase\Mongo\Server\3.4\bin\mongo.exe. Double-click it to run.

MongoDB shell version v3.4.0
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.4.0
Server has startup warnings:
2017-11-21T20:48:39.931+0800 I CONTROL  [initandlisten]
2017-11-21T20:48:39.933+0800 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2017-11-21T20:48:39.936+0800 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2017-11-21T20:48:39.940+0800 I CONTROL  [initandlisten]
>
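
The same connectivity check can also be done from Python before running any pymongo code. The following is a minimal sketch, assuming the server runs locally on the default port shown in the log and has no authentication; the helper names build_uri and server_is_up and the 2-second timeout are illustrative choices, not part of the original post.

```python
def build_uri(host="localhost", port=27017):
    """Build a MongoDB connection URI for a server without authentication."""
    return "mongodb://{}:{}".format(host, port)


def server_is_up(uri, timeout_ms=2000):
    """Return True if the server at `uri` answers a ping within the timeout."""
    # Imported here so that build_uri stays usable without pymongo installed.
    import pymongo
    from pymongo.errors import ConnectionFailure

    client = pymongo.MongoClient(uri, serverSelectionTimeoutMS=timeout_ms)
    try:
        # "ping" is a cheap admin command that forces a round trip to the server.
        client.admin.command("ping")
        return True
    except ConnectionFailure:
        return False
    finally:
        client.close()
```

While mongod is running, server_is_up(build_uri()) should return True; once the server is stopped, it should return False after roughly the timeout.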

 

2. Installing pymongo for Python

pip install pymongo

Here is a brief introduction to using pymongo. The code below is taken from the getting-started example on GitHub.

>>> import pymongo
>>> client = pymongo.MongoClient("localhost", 27017)
>>> db = client.test
>>> db.name
u'test'
>>> db.my_collection
Collection(Database(MongoClient('localhost', 27017), u'test'), u'my_collection')
>>> db.my_collection.insert_one({"x": 10}).inserted_id
ObjectId('4aba15ebe23f6b53b0000000')
>>> db.my_collection.insert_one({"x": 8}).inserted_id
ObjectId('4aba160ee23f6b543e000000')
>>> db.my_collection.insert_one({"x": 11}).inserted_id
ObjectId('4aba160ee23f6b543e000002')
>>> db.my_collection.find_one()
{u'x': 10, u'_id': ObjectId('4aba15ebe23f6b53b0000000')}
>>> for item in db.my_collection.find():
...     print(item["x"])
...
10
8
11
>>> db.my_collection.create_index("x")
u'x_1'
>>> for item in db.my_collection.find().sort("x", pymongo.ASCENDING):
...     print(item["x"])
...
8
10
11
>>> [item["x"] for item in db.my_collection.find().limit(2).skip(1)]
[8, 11]
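
The last result may look surprising: limit(2) is written before skip(1), yet the output is [8, 11] rather than [8]. MongoDB applies skip before limit regardless of the order in which the two are called on the cursor. The following pure-Python model illustrates this; it is only a sketch, the helper apply_skip_limit is hypothetical, and it assumes the unsorted query returns documents in insertion order, as it does in the session above.

```python
from itertools import islice


def apply_skip_limit(docs, skip=0, limit=0):
    """Model how a cursor applies skip first and limit second.

    As in MongoDB, a limit of 0 means "no limit".
    """
    stop = skip + limit if limit else None
    return list(islice(docs, skip, stop))


# Documents in the order the session inserted them.
docs = [{"x": 10}, {"x": 8}, {"x": 11}]

# Equivalent of find().limit(2).skip(1): skip one document, then take two.
result = [d["x"] for d in apply_skip_limit(docs, skip=1, limit=2)]
print(result)  # [8, 11]
```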

 

A pymongo usage example

1. A Python crawler that stores its data with pymongo

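import json

import pymongo
import requests


def requestData():
    # The real endpoint is masked in the original post.
    url = "http://****.com/*.do"
    data = {
        "projectId": 90,
        "myTaskFlag": 1,
        "userId": 40
    }
    # Send the parameters as a JSON string in the request body, then parse the JSON response.
    json_data = requests.post(url, data=json.dumps(data)).json()
    return json_data


def output_data(json_data):
    client = pymongo.MongoClient(host="localhost", port=27017)
    db = client.test
    collection = db.tasks
    # "taskList" holds the list of task documents in the response.
    tasks_data = json_data.get("taskList")
    # insert_many is used for a list of documents; Collection.insert is deprecated.
    collection.insert_many(tasks_data)


if __name__ == "__main__":
    json_data = requestData()
    output_data(json_data)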
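
The script passes whatever the response contains under "taskList" straight to the database, and pymongo rejects an empty list or None when inserting many documents. A more defensive variant might look like the sketch below; the helpers extract_tasks and store_tasks are hypothetical names introduced for illustration, and the code assumes the same local server and the test.tasks collection used above.

```python
def extract_tasks(json_data, key="taskList"):
    """Return the list of task documents, or an empty list if none are present."""
    tasks = json_data.get(key) if isinstance(json_data, dict) else None
    if not isinstance(tasks, list):
        return []
    # Keep only dict entries, since each MongoDB document must be a mapping.
    return [task for task in tasks if isinstance(task, dict)]


def store_tasks(json_data, host="localhost", port=27017):
    """Insert the extracted tasks into test.tasks and return how many were stored."""
    tasks = extract_tasks(json_data)
    if not tasks:
        return 0  # nothing to insert, so skip the database call entirely
    import pymongo  # imported lazily so extract_tasks works without pymongo

    client = pymongo.MongoClient(host=host, port=port)
    try:
        result = client.test.tasks.insert_many(tasks)
        return len(result.inserted_ids)
    finally:
        client.close()
```

Keeping the extraction step separate from the database call makes it possible to check the response handling without a running server.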

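We store the retrieved data in the tasks collection, using MongoDB's default test database. After the program finishes, the data can be inspected from the MongoDB client: running db.tasks.find().pretty() returns all documents in the tasks collection.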

{
        "_id" : ObjectId("5a1427a2edc9f04be40bc02d"),
        "taskId" : 1,
        "summary" : "PC版“个人信息”页面优化",
        "status" : 8,
        "categoryId" : 3,
        "creatorId" : 7,
        "projectId" : 1,
        "dateSubmit" : NumberLong("1481105108000"),
        "level" : 1,
        "handlerId" : 2,
        "ViewState" : 2,
        "priority" : 2
}
{
        "_id" : ObjectId("5a1427a2edc9f04be40bc02e"),
        "taskId" : 2,
        "summary" : "PC版“添加新任务”界面字体太大",
        "status" : 8,
        "categoryId" : 3,
        "creatorId" : 7,
        "projectId" : 1,
        "dateSubmit" : NumberLong("1481105195000"),
        "level" : 1,
        "handlerId" : 2,
        "ViewState" : 2,
        "priority" : 1
}
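
The dateSubmit field is stored as a NumberLong. Its magnitude suggests a Unix timestamp in milliseconds, although the original post does not say so. Under that assumption, a document read back from the collection could be made more readable as sketched below; millis_to_datetime and summarize are illustrative helper names.

```python
from datetime import datetime, timezone


def millis_to_datetime(millis):
    """Convert a Unix timestamp in milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


def summarize(task):
    """Return a one-line summary of a task document."""
    submitted = millis_to_datetime(task["dateSubmit"])
    return "#{} [{}] {}".format(task["taskId"], submitted.isoformat(), task["summary"])


# First document from the shell output above (only the fields used here).
task = {"taskId": 1, "summary": "PC版“个人信息”页面优化", "dateSubmit": 1481105108000}
print(summarize(task))
```

With pymongo, the same helper could be applied to each document returned by client.test.tasks.find().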

 


Original article: http://www.cnblogs.com/huhx/p/baseusepythonpymongo.html
