Posts Tagged ‘monitoring’

Monitor Atlantis ILIO via SNMP

Written by Ingmar Verheij on June 14th, 2013. Posted in Atlantis ILIO, Scripting / Programming

Atlantis ILIO Center can monitor the health and availability of ILIO Session and Replication hosts. If necessary, ILIO Center sends alerts via SMTP (mail) or via an SNMP trap (push). However, if ILIO Center shuts down unexpectedly or loses network connectivity, no SNMP trap is sent (or, when connectivity is lost, the trap never reaches the monitoring system).

Additional monitoring can be done via ping or by monitoring the state of the VM. However, verifying the availability of a machine via ping can produce false positives (a host may answer ping while its services are down), and not every monitoring system can monitor the state of a VM. You might therefore want to monitor via SNMP polling (pull).
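To make the pull model concrete, here is a minimal sketch of a poller that asks a host's SNMP agent for its uptime and treats any answer as "agent alive". It assumes the net-snmp `snmpget` command-line tool is installed on the monitoring machine and that the agent accepts SNMP v2c with the community string `public`. The function names and the default community are illustrative assumptions, not part of the script offered with this article.

```python
import subprocess

# sysUpTime.0 from the standard MIB-II system group; any SNMP agent should answer it.
SYS_UPTIME_OID = "1.3.6.1.2.1.1.3.0"


def build_snmpget_command(host, community="public", timeout_s=2, retries=1):
    """Build the net-snmp `snmpget` command line for a single uptime poll.

    The community string and timing values are assumptions; adjust them
    to match the agent configuration.
    """
    return [
        "snmpget",
        "-v2c",                 # SNMP version 2c
        "-c", community,        # community string
        "-t", str(timeout_s),   # timeout per attempt, in seconds
        "-r", str(retries),     # number of retries
        "-Oqv",                 # print the value only
        host,
        SYS_UPTIME_OID,
    ]


def agent_responds(host, community="public", runner=subprocess.run):
    """Return True if the SNMP agent on `host` answers the uptime poll.

    `runner` is injectable so the logic can be exercised without a network.
    """
    cmd = build_snmpget_command(host, community)
    try:
        result = runner(cmd, capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        # snmpget is missing or the poll hung: treat the agent as unreachable.
        return False
    return result.returncode == 0 and bool(result.stdout.strip())
```

Because the monitoring system initiates the request, a host that has crashed or dropped off the network shows up as a failed poll, which is exactly the case a trap cannot report.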

A vanilla installation of Atlantis ILIO (Center) does not have an SNMP agent installed, so I wrote a script that can push the agent and/or update the configuration. You can find the download at the bottom of this article.

Maintenance mode

Written by Ingmar Verheij on October 20th, 2010. Posted in Nagios / GroundWork, Operations Manager

Monitoring servers, services and connections is great. It enables proactive management, notification and escalation, and improves root cause analysis.

One big challenge is the number of notifications being sent and the relevance of those notifications. A well set-up environment sends notifications when problems arise or a negative trend is detected: signals for the administrator to get out of his lazy chair.
Most environments, however, send more notifications than needed, and many of them are irrelevant. This has a negative effect: the mailbox fills up rapidly and the value of each message decreases.

A reboot schedule is a good example of where a poorly planned monitoring environment goes wrong. Especially when terminal servers are periodically rebooted or redeployed, servers may be unreachable once in a while. The monitoring software assumes the server is in trouble, raises an alert and sends notifications.
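The idea behind a maintenance mode can be sketched in a few lines: before sending a notification, check whether the host is inside a planned maintenance window. The example below is a generic illustration and not the Nagios / GroundWork or Operations Manager mechanism. The `MaintenanceWindow` class, the `should_notify` function and the host name `ts01` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class MaintenanceWindow:
    """A weekly window in which a host is expected to be unreachable.

    Hypothetical structure for illustration; windows that cross midnight
    are not handled in this sketch.
    """
    weekday: int   # 0 = Monday ... 6 = Sunday
    start: time
    end: time

    def contains(self, moment: datetime) -> bool:
        return (moment.weekday() == self.weekday
                and self.start <= moment.time() < self.end)


def should_notify(host: str, moment: datetime, windows: dict) -> bool:
    """Return False when `host` is inside one of its maintenance windows."""
    return not any(w.contains(moment) for w in windows.get(host, []))


# Assumed schedule: terminal server "ts01" reboots on Sunday between 03:00 and 04:00.
windows = {"ts01": [MaintenanceWindow(weekday=6, start=time(3, 0), end=time(4, 0))]}
```

With this schedule, an outage of `ts01` at 03:30 on a Sunday is suppressed, while the same outage at 05:00, or on any other day, still produces a notification.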
