
Writing a Renren Photo Album Crawler with the Python requests Library

Posted: 2016-08-19 00:37:49


Worried that Renren might shut down, I wrote a crawler to download all the photos in my albums. The code is as follows:

# -*- coding: utf-8 -*-
import requests
import json
import os

def mkdir(path):
    # Create the directory (and any parents) unless it already exists
    path = path.strip()
    path = path.rstrip("\\")
    if not os.path.exists(path):
        os.makedirs(path)
        print(path + u' created')
    else:
        print(path + u' already exists')

# Main procedure when this file is run as a script
if __name__ == '__main__':
    # Create a requests session so the login cookies are reused
    s = requests.Session()
    # Initial URL (login)
    origin_url = 'http://www.renren.com'
    login_data = {
        'email': 'your account',
        'domain': 'renren.com',
        'origURL': 'http://www.renren.com/home',
        'key_id': 1,
        'captcha_type': 'web_login',
        'password': 'obtained by capturing the login request',
        'rkey': 'obtained by capturing the login request'
    }
    r = s.post("http://www.renren.com/ajaxLogin/login?1=1&uniqueTimestamp=2016742045262", data=login_data)
    if 'true' in r.text:
        print(u'Logged in to Renren')
    # Open the album
    albums = ['album-987919974']
    album_url = 'http://photo.renren.com/photo/278382090/' + albums[0] + '/v7'
    r = s.get(album_url)
    if "photoId" in r.text:
        print(u'Opened the album')
        #print(r.text)
        content = r.text
        # The photo data sits between these two markers in the page source
        index1 = content.find('nx.data.photo = ')
        print(index1)
        index2 = content.find('; define.config')
        print(index2)
        # 16 is the length of the marker 'nx.data.photo = '
        target_json = content[index1+16:index2].strip()
        # Trim 13 leading and 2 trailing characters of wrapper text
        target_json = target_json[13:len(target_json)-2]
        print(target_json)
        # The page uses single quotes; JSON requires double quotes
        data = json.loads(target_json.replace("'", '"'))
        photos = data['photoList']
        album_name = data['albumName']
        # Define and create the target directory
        album_path = 'd:\\' + album_name
        print(album_path)
        mkdir(album_path)
        for photo in photos:
            #print(photo['url'])
            image_name = str(photo['photoId'])
            photo_url = photo['url']
            r = requests.get(photo_url)
            image_path = album_path + '/' + image_name + '.jpg'
            # Write the image bytes; the with block closes the file
            with open(image_path, 'wb') as f:
                f.write(r.content)
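
The script locates the photo data with the fixed offsets 16 and 13, which only hold while the page layout stays exactly the same. As a rough sketch of a slightly more defensive variant, the helper below derives the offset from the marker length and raises an error when a marker is missing. The function name `extract_js_object` and the sample page fragment are made up for illustration; the real Renren page may wrap the data differently, so the result may still need unwrapping.

```python
import json

START_MARKER = "nx.data.photo = "
END_MARKER = "; define.config"

def extract_js_object(page, start_marker=START_MARKER, end_marker=END_MARKER):
    """Parse the object literal found between the two markers in page."""
    start = page.find(start_marker)
    if start == -1:
        raise ValueError("start marker not found: %r" % start_marker)
    end = page.find(end_marker, start)
    if end == -1:
        raise ValueError("end marker not found: %r" % end_marker)
    literal = page[start + len(start_marker):end].strip()
    # Same naive quote swap as the script above; it breaks if a value
    # itself contains an apostrophe.
    return json.loads(literal.replace("'", '"'))

# Hypothetical page fragment, for illustration only
sample_page = (
    "<script>nx.data.photo = {'albumName': 'demo', "
    "'photoList': [{'photoId': '1', 'url': 'http://example.com/1.jpg'}]}"
    "; define.config = {};</script>"
)
data = extract_js_object(sample_page)
print(data["albumName"], len(data["photoList"]))
```

Failing with a `ValueError` makes a layout change visible immediately, instead of passing a wrongly sliced string to `json.loads`.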

Done! The result of running it looks like this:

[Screenshot of the run output]


Original article: http://www.cnblogs.com/LanTianYou/p/5785830.html
