Posts Tagged ‘distributed application’

Failed heartbeat unnoticed in Distributed Application

Written by Ingmar Verheij on July 12th, 2011. Posted in Monitoring

Server down

System Center Operations Manager (SCOM) monitors the health of systems with an agent. One of the most basic checks is a health check of the agent itself, which includes a heartbeat between the agent and the RMS (Root Management Server). If three heartbeats in a row are missed (the number is configurable), the agent is considered unavailable.

Health Service Heartbeat Failure

An alert is generated and (if configured) a notification is sent to inform the administrator that there is a problem.
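The heartbeat rule described above can be sketched in a few lines. This is only an illustrative model, not SCOM's implementation: the class name `HeartbeatTracker`, the 60-second interval and the method names are assumptions made for the example. Only the threshold of three missed heartbeats comes from the text.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatTracker:
    """Hypothetical model of an agent heartbeat check (not SCOM code)."""

    interval_seconds: float = 60.0  # assumed interval, for illustration only
    missed_threshold: int = 3       # "three times (configurable)" from the text
    last_seen: float = 0.0          # timestamp of the last heartbeat received

    def record(self, now: float) -> None:
        """Store the time at which a heartbeat arrived."""
        self.last_seen = now

    def missed(self, now: float) -> int:
        """Number of full intervals that passed without a heartbeat."""
        elapsed = now - self.last_seen
        return max(0, int(elapsed // self.interval_seconds))

    def is_available(self, now: float) -> bool:
        """The agent counts as available until the threshold is reached."""
        return self.missed(now) < self.missed_threshold


tracker = HeartbeatTracker()
tracker.record(0.0)
print(tracker.is_available(170.0))  # two intervals missed
print(tracker.is_available(180.0))  # three intervals missed
```

With these assumed values the agent is still considered available after two missed intervals and becomes unavailable once the third is missed.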

But if a Distributed Application is configured to monitor a chain of components, this failure remains unnoticed.

Node state 'Healthy'

Nodes that are unmonitored are grey and appear to be ‘Healthy’, which is strange for a node whose heartbeat has not been reported for quite some time.
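One way to picture why the failure stays hidden is a simplified worst-state rollup. The sketch below is an assumption about the general idea, not SCOM's actual rollup algorithm: the `State` enum and both functions are hypothetical. The first function skips unmonitored members, so a grey node never pulls the application down. The second treats an unmonitored member as a problem.

```python
from enum import Enum


class State(Enum):
    """Hypothetical health states, ordered from best to worst."""

    NOT_MONITORED = 0
    HEALTHY = 1
    WARNING = 2
    CRITICAL = 3


def rollup_ignoring_unmonitored(states):
    """Worst state of the monitored members; grey members are skipped."""
    monitored = [s for s in states if s is not State.NOT_MONITORED]
    if not monitored:
        return State.NOT_MONITORED
    return max(monitored, key=lambda s: s.value)


def rollup_strict(states):
    """Worst state, counting an unmonitored member as critical."""
    if not states:
        return State.NOT_MONITORED
    effective = [
        State.CRITICAL if s is State.NOT_MONITORED else s for s in states
    ]
    return max(effective, key=lambda s: s.value)


members = [State.HEALTHY, State.HEALTHY, State.NOT_MONITORED]
print(rollup_ignoring_unmonitored(members).name)
print(rollup_strict(members).name)
```

Under the first rule the application stays healthy even though one member has stopped reporting; under the second the lost node is surfaced.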

SCOM : Configure notification for distributed applications

Written by Ingmar Verheij on May 24th, 2011. Posted in Operations Manager

Events generated by System Center Operations Manager (SCOM), like alerts and warnings, usually indicate (upcoming) problems. Notifying your system administrators enables them to troubleshoot the problem as quickly as possible.

For a customer I’ve configured multiple distributed applications. Each distributed application defines a critical application that needs to be monitored. All distributed applications are displayed on a monitor that shows the state of each one.

 

When an event is triggered, for instance because the service is down, a notification needs to be sent. Not only to the system administrators, who administer the infrastructure, but also to the technical and functional application operators.

Active Directory groups are used to keep the membership manageable, since role-based access control (RBAC) is used.

 
