Python Series, Getting Started: HDFS

Date: 2018-01-22 14:10:46

Introduction

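HDFS (Hadoop Distributed File System) is Hadoop's distributed file system. It is highly fault tolerant and is designed to be deployed on inexpensive machines. Python
can talk to it through two interfaces: hdfscli (RESTful API calls, installed as the hdfs package) and pyhdfs (RPC calls). This part focuses on using hdfscli.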
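
To make "RESTful API call" concrete, here is a minimal sketch of what a WebHDFS request URL looks like. It only illustrates the URL layout documented for WebHDFS (`/webhdfs/v1<path>?op=...`). The helper name `webhdfs_url` is hypothetical, and this is not a description of how hdfscli builds its requests internally.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(root, hdfs_path, op, **params):
    """Build a WebHDFS URL of the form <root>/webhdfs/v1<path>?op=<OP>&...

    `webhdfs_url` is a hypothetical helper for illustration only.
    """
    query = urlencode({'op': op, **params})
    return '{}/webhdfs/v1{}?{}'.format(root.rstrip('/'), quote(hdfs_path), query)


# The host and port are the ones used later in this post.
print(webhdfs_url('http://localhost:50070', '/test', 'MKDIRS', permission='777'))
```

A client library such as hdfscli wraps requests of this kind, so you normally never build these URLs by hand.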

Code examples

  1. Install

    pip install hdfs
  2. Import the required modules

    from hdfs import Client, InsecureClient
  3. Create the client

    """
    The library provides two kinds of client, Client and InsecureClient.
    Client: cannot specify the file owner
    InsecureClient: can specify the file owner through `user`, default None
    """
    hdfs_root_path = 'http://localhost:50070'
    # Use one of the two clients; the second assignment replaces the first.
    fs = Client(hdfs_root_path)
    fs = InsecureClient(hdfs_root_path, user='hdfs')
  4. Create a directory

    """
    Set the directory permission to 777; permission defaults to None
    """
    fs.makedirs('/test', permission=777)
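
The value 777 above is read as Unix-style octal permission digits. As a small sketch of what those digits mean, the hypothetical helper below (not part of the hdfs package) decodes them into the familiar rwx form. It assumes the digits are written out as in the call above (777, not the Python literal 0o777, which is the integer 511).

```python
def describe_permission(octal_digits):
    """Turn permission digits such as 777 or '755' into 'rwxrwxrwx' form.

    Hypothetical helper for illustration; it only decodes the notation.
    """
    digits = str(octal_digits).zfill(3)[-3:]
    parts = []
    for digit in digits:
        bits = int(digit, 8)  # raises ValueError for digits 8 and 9
        parts.append(
            ('r' if bits & 4 else '-')
            + ('w' if bits & 2 else '-')
            + ('x' if bits & 1 else '-')
        )
    return ''.join(parts)


print(describe_permission(777))
print(describe_permission('755'))
```

Granting 777 opens the directory to every user, so a tighter value such as 755 is usually a safer choice outside of experiments.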
  5. Write a file

    """
    Whether to create or append depends on whether the file already exists.
    strict: If `False`, return `None` rather than raise an exception if
          the path doesn't exist.
    """
    hdfs_file_path = '/test/test.txt'
    data = 'hello hdfs\n'  # placeholder content, not from the original post
    content = fs.content(hdfs_file_path, strict=False)
    if content is None:
        fs.write(hdfs_file_path, data=data, permission=777)
    else:
        fs.write(hdfs_file_path, data=data, append=True)
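  6. Upload a file

    """
    overwrite defaults to False. If it is not set to True, uploading a file that
    already exists in HDFS raises a "file already exists" exception.
    """
    hdfs_path = '/test/test.txt'  # placeholder remote path
    local_path = 'test.txt'       # placeholder local path
    fs.upload(hdfs_path, local_path, overwrite=True)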
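  7. Summary
    I have not yet found a dedicated method for checking whether a file exists, so the examples above use fs.content() as a substitute. If you know a better way, please share it with me.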
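
One possible answer to the question in the summary, offered as a sketch rather than the definitive approach: the client's `status` method accepts the same `strict` flag as `content`, so `status(path, strict=False)` returning `None` can serve as an existence check. The helper names `path_exists` and `write_or_append`, and the `FakeClient` class, are hypothetical. `FakeClient` is an in-memory stand-in so the example runs without a cluster.

```python
def path_exists(client, hdfs_path):
    """Return True if hdfs_path exists, based on status(..., strict=False)."""
    return client.status(hdfs_path, strict=False) is not None


def write_or_append(client, hdfs_path, data):
    """Create the file on the first write and append on later writes."""
    if path_exists(client, hdfs_path):
        client.write(hdfs_path, data=data, append=True)
        return 'appended'
    client.write(hdfs_path, data=data)
    return 'created'


class FakeClient:
    """Hypothetical in-memory stand-in that mimics status() and write()."""

    def __init__(self):
        self.files = {}

    def status(self, hdfs_path, strict=True):
        if hdfs_path not in self.files:
            if strict:
                # The real client raises its own error type here.
                raise FileNotFoundError(hdfs_path)
            return None
        return {'type': 'FILE', 'length': len(self.files[hdfs_path])}

    def write(self, hdfs_path, data, append=False):
        if append:
            self.files[hdfs_path] += data
        else:
            self.files[hdfs_path] = data


demo = FakeClient()
print(write_or_append(demo, '/test/test.txt', 'a\n'))
print(write_or_append(demo, '/test/test.txt', 'b\n'))
```

With a real client, the same helpers could be called as `write_or_append(fs, '/test/test.txt', data)`. The check and the write are two separate requests, so another writer could create or delete the file in between.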

Original article: https://www.cnblogs.com/dzqk/p/8328510.html
