
Log Aggregation in Hadoop

Date: 2014-05-19


The problem encountered:

When clicking the logs link in the YARN web UI, an error page appears instead of the container logs.

The solution:

By default, Hadoop stores the logs of each container on the node where that container ran. This is not a problem if you're just testing Hadoop in a single-node environment (all the logs end up on your machine anyway), but with a cluster of nodes, keeping track of the logs can become quite a bother. In addition, since logs are kept on the local filesystem, you may run into storage problems if you retain logs for a long time or have nodes with heterogeneous storage capacities.

Log aggregation is a new feature that allows Hadoop to store the logs of each application in a central directory in HDFS. To activate it, just add the following to yarn-site.xml and restart the Hadoop services:

 <property>
    <description>Whether to enable log aggregation</description>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>

By adding this option, you're telling Hadoop to move the application logs to hdfs:///logs/userlogs/<your user>/<app id>. You can change this path and other options related to log aggregation by setting other properties documented in the default yarn-site.xml (just do a search for log.aggregation).
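As a sketch, here are two of those related properties. The property names come from the stock yarn-site.xml; the values below are illustrative choices, not required defaults:

```xml
<!-- Illustrative values only; consult your distribution's default yarn-site.xml -->
<property>
  <description>HDFS directory where aggregated container logs are stored</description>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/logs</value>
</property>
<property>
  <description>How long to keep aggregated logs, in seconds (one week here)</description>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>604800</value>
</property>
```

Setting a retention period is worth considering on a busy cluster, since aggregated logs otherwise accumulate in HDFS indefinitely.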

However, these aggregated logs are not stored in a human-readable format, so you can't just cat their contents. Fortunately, the Hadoop developers have included several handy command-line tools for reading them:

# Read logs from any YARN application
$HADOOP_HOME/bin/yarn logs -applicationId <applicationId>
 
# Read logs from MapReduce jobs
$HADOOP_HOME/bin/mapred job -logs <jobId>
 
# Read them in a scrollable window with search (type '/' followed by your query).
$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> | less
 
# Or just save it to a file and use your favourite editor
$HADOOP_HOME/bin/yarn logs -applicationId <applicationId> > log.txt

You can also access these logs via a web app for MapReduce jobs by using the JobHistory daemon. This daemon can be started/stopped by running the following:

# Start JobHistory daemon
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh start historyserver
# Stop JobHistory daemon
$HADOOP_PREFIX/sbin/mr-jobhistory-daemon.sh stop historyserver
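If you need the daemon reachable from other machines, its RPC and web addresses can be set in mapred-site.xml. A hedged sketch follows: the property names are from the stock mapred-site.xml, the hostname is a placeholder, and 10020/19888 are the usual default ports:

```xml
<!-- historyserver.example.com is a placeholder for your chosen node -->
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>historyserver.example.com:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>historyserver.example.com:19888</value>
</property>
```

With the web address set, the job history UI is served at http://historyserver.example.com:19888.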

My Fabric script includes an optional variable for setting the node where to launch this daemon so it is automatically started/stopped when you run fab start or fab stop.

Unfortunately, a generic history daemon for universal web access to aggregated logs does not exist yet. However, as you can see by checking YARN-321, there's considerable work being done in this area. When this gets introduced I'll update this section.




Original article: http://www.cnblogs.com/rolly-yan/p/3731734.html
