
How to stop false heartbeat alerts for DMZ servers

One of the strange things in OpsMgr is the relationship between the Health Service Watcher object and the Root Management Server (RMS). A common mistake is to assume that when we point a server at a gateway server (GW) or a Management Server (MS), that GW or MS is responsible for alerting us about the availability of the monitored server.

This is not the case in the current version of OpsMgr (hopefully the next version will help us deal with it better). All Health Service Watcher objects are placed on the RMS, so if the GW server is down we get a flood of “Computer not reachable” and “Health Service Heartbeat Failure” alerts for servers that are up and running.

[Image: HB_RMS]

Let's start with a common scenario in which a GW server is connected to an MS through a firewall (FW).

[Image: GW]

In this scenario, when the MS, GW, or FW is down, the console fills with false alarms claiming that all agents behind the FW are down.
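To make the failure mode concrete, here is a small Python model of the scenario. It is not OpsMgr code: the function `heartbeat_alerts`, the hop names, and the host names are made up for illustration, and it assumes the watcher on the RMS flags every agent whose heartbeat cannot travel the whole path, whether or not the agent itself is running.

```python
def heartbeat_alerts(agents_up, path_up):
    """Return the agents the RMS would flag as failed.

    agents_up: dict of agent name -> True if the agent is really running.
    path_up:   dict of hop name (GW, FW, MS) -> True if that hop is working.
    """
    # A heartbeat only reaches the RMS if every hop on the path is up.
    path_ok = all(path_up.values())
    return sorted(name for name, up in agents_up.items() if not (up and path_ok))


# Hypothetical DMZ agents; only dmz-sql01 is genuinely down.
agents = {"dmz-web01": True, "dmz-web02": True, "dmz-sql01": False}

# Healthy path: only the real failure is reported.
print(heartbeat_alerts(agents, {"GW": True, "FW": True, "MS": True}))

# GW down: every agent behind it is reported, healthy or not.
print(heartbeat_alerts(agents, {"GW": False, "FW": True, "MS": True}))
```

The second call is the flood of false alerts described above: the healthy agents are indistinguishable from the failed one once any hop between them and the RMS goes away.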

We have 2 options to work around this:

Option 1: 

[Image: GW1]

Add another GW server (GW2) and set all agents to fail over to it if GW1 fails (see: How to failover an agent\GW). Keep in mind that if the FW or the network devices that connect the GWs to the MS fail, you will still get all the unwanted alerts.
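The limits of Option 1 can be shown with a similar toy model. Again, this is only an illustrative sketch in Python, not how agents are actually configured; `reachable_gateway` and its parameters are hypothetical names, and it assumes both gateways sit behind the same firewall.

```python
def reachable_gateway(gateways, firewall_up):
    """Return the first working gateway an agent can report through, or None.

    gateways:    ordered list of (name, is_up) pairs, primary first.
    firewall_up: True if the FW between the gateways and the MS is working.
    """
    # Both gateways share the FW, so a FW failure cuts off every gateway.
    if not firewall_up:
        return None
    for name, is_up in gateways:
        if is_up:
            return name
    return None


# GW1 is down, GW2 takes over: heartbeats still reach the MS.
print(reachable_gateway([("GW1", False), ("GW2", True)], firewall_up=True))

# FW is down: failover does not help, so the false alerts return.
print(reachable_gateway([("GW1", True), ("GW2", True)], firewall_up=False))
```

A second gateway removes the gateway itself as a single point of failure, but not the shared firewall or network path.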

Option 2: Create an override for the two monitors “Computer not reachable” and “Health Service Heartbeat Failure”, and create a rule on the GW server that catches an event when a monitored server is down.

1. Create a group that contains all Health Service Watcher (agent) objects in the DMZ. In my case this was easy: I just needed to exclude all agents in my internal domains.

[Image: hb_2]

2. Go to the Authoring pane, search for the two monitors “Computer not reachable” and “Health Service Heartbeat Failure”, and set an override on each for the group created in step 1.

[Image: hb_3]

[Image: hb_4]

3. Create an event rule that catches the following event, and target the rule only at the GW server.

[Image: hb_5]

In the end you will have two overridden monitors and a new event rule.

[Image: Overrides]
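The group in step 1 is built in the console, but the "everything except my internal domains" logic behind it can be sketched in a few lines of Python. This is only an illustration of the exclusion idea, not an OpsMgr group definition; `dmz_agents`, the agent names, and the domain names are all hypothetical.

```python
def dmz_agents(agent_fqdns, internal_domains):
    """Return the agents whose FQDN is not in any internal domain.

    agent_fqdns:      iterable of agent fully qualified domain names.
    internal_domains: iterable of internal DNS domain names to exclude.
    """
    # Match on ".domain" so "corp.example.com" does not match "notcorp.example.com".
    suffixes = tuple("." + domain.lower() for domain in internal_domains)
    return [fqdn for fqdn in agent_fqdns if not fqdn.lower().endswith(suffixes)]


# Hypothetical agent list: two internal servers and two DMZ servers.
agents = [
    "app01.corp.example.com",
    "web01.dmz.example.net",
    "sql01.CORP.example.com",
    "web02.dmz.example.net",
]
print(dmz_agents(agents, ["corp.example.com"]))
```

Defining the group by exclusion means new DMZ agents fall under the overrides automatically, as long as they do not join one of the listed internal domains.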

I hope this helps you lower the number of false alert notifications.
