Icinga State Types

Date: 2014-08-18 12:03:34

5.8. State Types

5.8.1. Introduction

The current state of monitored services and hosts is determined by two components:

  • The status of the service or host (i.e. OK, WARNING, UP, DOWN, etc.)

  • The type of state the service or host is in

There are two state types in Icinga: SOFT states and HARD states. These state types are a crucial part of the monitoring logic, as they are used to determine when event handlers are executed and when notifications are initially sent out.

This document describes the difference between SOFT and HARD states, how they occur, and what happens when they occur.

5.8.2. Service and Host Check Retries

In order to prevent false alarms from transient problems, Icinga allows you to define how many times a service or host should be (re)checked before it is considered to have a "real" problem. This is controlled by the max_check_attempts option in the host and service definitions. Understanding how hosts and services are (re)checked in order to determine if a real problem exists is important in understanding how state types work.

5.8.3. Soft States

Soft states occur in the following situations:

  • When a service or host check results in a non-OK or non-UP state and the service or host has not yet been (re)checked the number of times specified by the max_check_attempts directive in the service or host definition. This is called a soft error.

  • When a service or host recovers from a soft error. This is considered a soft recovery.

The following things occur when hosts or services experience SOFT state changes:

  • The SOFT state is logged.

  • Event handlers are executed to handle the SOFT state.

SOFT states are only logged if you have enabled the log_service_retries or log_host_retries options in your main configuration file.

The only important thing that really happens during a soft state is the execution of event handlers. Event handlers can be particularly useful if you want to proactively fix a problem before it turns into a HARD state. The $HOSTSTATETYPE$ or $SERVICESTATETYPE$ macros will have a value of "SOFT" when event handlers are executed, which allows your event handler scripts to know when they should take corrective action. More information is available in the event handlers documentation.
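As a rough illustration of the paragraph above, the following Python sketch shows how an event handler script might branch on the state type. It assumes the handler receives the state and the state type as its two arguments (for example by passing $SERVICESTATE$ and $SERVICESTATETYPE$ in the command definition). The function names and the returned action labels are hypothetical and not part of Icinga.

```python
# Hypothetical event handler logic; names and action labels are illustrative only.

PROBLEM_FREE_STATES = {"OK", "UP"}


def choose_action(state: str, state_type: str) -> str:
    """Pick an action label from the state and state type passed to the handler."""
    if state in PROBLEM_FREE_STATES:
        return "nothing"  # healthy state or recovery: nothing to fix
    if state_type == "SOFT":
        return "attempt_fix"  # problem not yet confirmed: try to fix it early
    return "confirmed_problem"  # HARD problem: what to do here is site-specific


def handle(argv: list[str]) -> str:
    """argv is assumed to hold [state, state_type], as passed by the command definition."""
    if len(argv) != 2:
        raise ValueError("expected: <state> <state_type>")
    state, state_type = argv
    return choose_action(state.upper(), state_type.upper())


# Example call with the values a handler might see on the first failed check.
print(handle(["CRITICAL", "SOFT"]))
```

A real handler would replace the returned labels with actual commands, such as restarting a daemon, and would normally be invoked by Icinga rather than called directly.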

5.8.4. Hard States

Hard states occur for hosts and services in the following situations:

  • When a host or service check results in a non-UP or non-OK state and it has been (re)checked the number of times specified by the max_check_attempts option in the host or service definition. This is a hard error state.

  • When a host or service transitions from one hard error state to another error state (e.g. WARNING to CRITICAL).

  • When a service check results in a non-OK state and its corresponding host is either DOWN or UNREACHABLE.

  • When a host or service recovers from a hard error state. This is considered to be a hard recovery.

  • When a passive host check is received. Passive host checks are treated as HARD unless the passive_host_checks_are_soft option is enabled.
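Some of the rules in this list can be expressed as a small decision function. The Python sketch below covers only the retry rule, the rule for services whose host is DOWN or UNREACHABLE, and the passive host check rule. The function names and parameters are assumptions made for illustration and do not mirror Icinga's internal code.

```python
# Illustrative only: a simplified reading of three of the rules listed above.


def service_problem_state_type(attempt: int, max_check_attempts: int,
                               host_state: str = "UP") -> str:
    """Return the state type for a non-OK service check result."""
    if host_state in ("DOWN", "UNREACHABLE"):
        # A non-OK service on a DOWN or UNREACHABLE host is HARD right away.
        return "HARD"
    # Otherwise the problem is HARD once the retries are used up.
    return "HARD" if attempt >= max_check_attempts else "SOFT"


def passive_host_check_forced_hard(passive_host_checks_are_soft: bool = False) -> bool:
    """Passive host checks are treated as HARD unless the option is enabled."""
    return not passive_host_checks_are_soft


print(service_problem_state_type(1, 3))
print(service_problem_state_type(3, 3))
print(service_problem_state_type(1, 3, host_state="DOWN"))
```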

The following things occur when hosts or services experience HARD state changes:

  • The HARD state is logged.

  • Event handlers are executed to handle the HARD state.

  • Contacts are notified of the host or service problem or recovery.

The $HOSTSTATETYPE$ or $SERVICESTATETYPE$ macros will have a value of "HARD" when event handlers are executed, which allows your event handler scripts to know when they should take corrective action. More information is available in the event handlers documentation.
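The consequences of SOFT and HARD state changes described in sections 5.8.3 and 5.8.4 can be summarised in a few lines of Python. This is a minimal sketch: the function name and the action labels are hypothetical, and the log_retries parameter stands in for the log_service_retries or log_host_retries option.

```python
# Illustrative summary of what happens on a state change; not Icinga code.


def actions_on_state_change(state_type: str, log_retries: bool = False) -> list[str]:
    """List the actions triggered by a SOFT or HARD state change."""
    actions = []
    # HARD states are logged; SOFT states only if retry logging is enabled.
    if state_type == "HARD" or log_retries:
        actions.append("log")
    # Event handlers run for both state types.
    actions.append("run_event_handler")
    # Only HARD state changes lead to notifications.
    if state_type == "HARD":
        actions.append("notify_contacts")
    return actions


print(actions_on_state_change("SOFT"))
print(actions_on_state_change("HARD"))
```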

5.8.5. Example

Here's an example of how state types are determined, when state changes occur, and when event handlers are executed and notifications are sent out. The table below shows consecutive checks of a service over time. The service has a max_check_attempts value of 3.

| Time | Check # | State | State Type | State Change | Notes |
|------|---------|-------|------------|--------------|-------|
| 0 | 1 | OK | HARD | No | Initial state of the service. |
| 1 | 1 | CRITICAL | SOFT | Yes | First detection of a non-OK state. Event handlers execute. |
| 2 | 2 | WARNING | SOFT | Yes | Service continues to be in a non-OK state. Event handlers execute. |
| 3 | 3 | CRITICAL | HARD | Yes | Max check attempts has been reached, so the service goes into a HARD state. Event handlers execute and a problem notification is sent out. Check # is reset to 1 immediately after this happens. |
| 4 | 1 | WARNING | HARD | Yes | Service changes to a HARD WARNING state. Event handlers execute and a problem notification is sent out. |
| 5 | 1 | WARNING | HARD | No | Service stabilizes in a HARD problem state. Depending on the notification interval for the service, another notification might be sent out. |
| 6 | 1 | OK | HARD | Yes | Service experiences a HARD recovery. Event handlers execute and a recovery notification is sent out. |
| 7 | 1 | OK | HARD | No | Service is still OK. |
| 8 | 1 | UNKNOWN | SOFT | Yes | Service is detected as changing to a SOFT non-OK state. Event handlers execute. |
| 9 | 2 | OK | SOFT | Yes | Service experiences a SOFT recovery. Event handlers execute, but notifications are not sent, as this wasn't a "real" problem. State type is set to HARD and check # is reset to 1 immediately after this happens. |
| 10 | 1 | OK | HARD | No | Service stabilizes in an OK state. |
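The sequence in the table can be reproduced with a small simulation. The Python sketch below is a simplified model written for this example, not Icinga's actual implementation: it handles only active service checks on a host that is UP, and it assumes that a "state change" means a change in either the state or the state type. The names ServiceStatus and process_check are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServiceStatus:
    """Stored status between checks (hypothetical structure)."""
    state: str = "OK"
    state_type: str = "HARD"
    attempt: int = 1


def process_check(status: ServiceStatus, new_state: str, max_check_attempts: int):
    """Apply one check result; return (check #, state, state type, state change)."""
    prev_state, prev_type, prev_attempt = status.state, status.state_type, status.attempt

    if new_state == "OK":
        if prev_state != "OK" and prev_type == "SOFT":
            # SOFT recovery: reported as SOFT with the next check number.
            row = (prev_attempt + 1, "OK", "SOFT", True)
        else:
            # Still OK, or HARD recovery from a HARD problem state.
            row = (1, "OK", "HARD", prev_state != "OK")
        # After any OK result the stored status is OK/HARD with check # 1.
        status.state, status.state_type, status.attempt = "OK", "HARD", 1
        return row

    if prev_state != "OK" and prev_type == "HARD":
        # Already in a HARD problem state: stay HARD, check # stays 1.
        attempt, state_type = 1, "HARD"
    else:
        # First failure starts at 1; SOFT retries count upwards.
        attempt = 1 if prev_state == "OK" else prev_attempt + 1
        state_type = "HARD" if attempt >= max_check_attempts else "SOFT"

    changed = new_state != prev_state or state_type != prev_type
    status.state, status.state_type = new_state, state_type
    # Check # is reset to 1 once a HARD state has been reached.
    status.attempt = 1 if state_type == "HARD" else attempt
    return (attempt, new_state, state_type, changed)


# Check results for times 1 to 10 in the table (time 0 is the initial state).
results = ["CRITICAL", "WARNING", "CRITICAL", "WARNING", "WARNING",
           "OK", "OK", "UNKNOWN", "OK", "OK"]
status = ServiceStatus()
rows = [process_check(status, state, max_check_attempts=3) for state in results]
for time, row in enumerate(rows, start=1):
    print(time, *row)
```

Rules that the table does not exercise, such as a DOWN or UNREACHABLE host or passive checks, are deliberately left out of this model.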

Original source: http://www.cnblogs.com/sanquanfeng/p/3919063.html