Common helper functions for a simple web crawler
import urllib2

# Fetch the raw content of a web page
def getWebContent(url):
    # Send a browser-like User-Agent, since some sites reject the default one
    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11'}
    req = urllib2.Request(url, headers=headers)
    html = urllib2.urlopen(req).read()
    return html
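# Download an image and save it to a local file
def downloadImage(imageUrl, localName):
    data = urllib2.urlopen(imageUrl).read()
    # Write in binary mode; the with block closes the file even on error
    with open(localName, 'wb') as fi:
        fi.write(data)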
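
The two helpers above rely on urllib2, which exists only in Python 2. Under Python 3 the same calls live in urllib.request, so a rough port might look like the sketch below. The snake_case names, the DEFAULT_HEADERS constant, and the timeout parameter are assumptions added for illustration and are not part of the original post.

```python
import shutil
import urllib.request

# Assumed module-level default; it reuses the User-Agent string from the post.
DEFAULT_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.11 "
        "(KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11"
    )
}


def get_web_content(url, headers=None, timeout=10):
    """Return the raw bytes served at url (hypothetical Python 3 port)."""
    req = urllib.request.Request(url, headers=headers or DEFAULT_HEADERS)
    # The with block closes the response even if read() raises.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def download_image(image_url, local_name, timeout=10):
    """Copy the resource at image_url into the local file local_name."""
    with urllib.request.urlopen(image_url, timeout=timeout) as resp, \
            open(local_name, "wb") as out:
        # Copy in chunks instead of loading the whole image into memory.
        shutil.copyfileobj(resp, out)
```

Note that get_web_content returns bytes in Python 3, so the caller would decode the result (for example with .decode("utf-8")) if the page's encoding is known.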
Original article: http://www.cnblogs.com/pengzhong/p/PythonCustomFunctions.html