Archive

Posts Tagged ‘Notifications’

Operations Manager 2007 R2: The Notification challenge

02/06/2011

Recently, more and more customers have been asking for more granular and complex notifications in SCOM 2007. As you probably know, notification in Operations Manager is based on four channels (E-mail, SMS, IM and Command). These channels deliver alerts based on specific criteria, and the criteria are part of a subscription.

Subscription = what to send + to whom + how
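To make the formula concrete, here is a minimal Python sketch of how a subscription ties the three parts together. It is only a toy model and not the OpsMgr SDK: the `Alert` and `Subscription` classes, their field names and the sample address are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Alert:
    name: str
    severity: str       # e.g. "Critical" or "Warning"
    source_class: str   # hypothetical field: class of the object that raised the alert


@dataclass
class Subscription:
    criteria: Callable[[Alert], bool]  # what to send
    recipients: List[str]              # to whom
    channel: str                       # how: "E-mail", "SMS", "IM" or "Command"

    def deliveries(self, alert: Alert) -> List[str]:
        """Return one delivery line per recipient if the alert matches the criteria."""
        if not self.criteria(alert):
            return []
        return [f"{self.channel} -> {r}: {alert.name}" for r in self.recipients]


sub = Subscription(
    criteria=lambda a: a.severity == "Critical" and a.source_class == "IIS Web Site",
    recipients=["web-team@example.com"],
    channel="E-mail",
)
print(sub.deliveries(Alert("Web site unavailable", "Critical", "IIS Web Site")))
```

An alert that does not match the criteria simply produces no deliveries, which is why getting the criteria right matters so much.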

The criteria in the subscription have changed since OpsMgr SP1, and luckily the OpsMgr team added some great features to the product, such as sending alerts “raised by a specific rule or monitor” or “raised by any instance in a specific group or class”. Despite these changes, it is still hard to set up alerts based on the needs of an enterprise, especially when dealing with a large and complex environment.

That’s why I can clearly say that it’s hard to get notifications to work the way we want, and in most cases, without fully understanding the object-oriented class model of OpsMgr, it’s even harder.

I made a list of what most of my customers want:

1. Ability to manage and maintain notification information in a reliable and simple way.

2. Ability to limit (in some cases) the notification to only one alert. For example, take a server with the IIS role that hosts several web sites, all of which are monitored. When IIS stops working we get alerts for every web site hosted on that server.

3. Trace which alert was sent to a recipient, when and how (mail, SMS, etc.).

4. Ability to set up an on-call list (duty roster).

5. Ability to let IT personnel route alerts to others during vacations (like an out-of-office mechanism).

6. Ability to ensure notifications reach the designated personnel using two-way communication and escalation.

Most of the above are not present in Operations Manager, and given the problems we had, we needed to look for alternatives.
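To illustrate what items 2 and 5 ask for, here is a small Python sketch. It does not show how OpsMgr or any third-party product implements these features; the function names, the rule that "IIS" is the parent of every web site on a server, and the sample names are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# An alert is reduced to a (server, component) pair for this example.
AlertKey = Tuple[str, str]


def collapse_alerts(alerts: List[AlertKey], parent: str = "IIS") -> List[AlertKey]:
    """Keep only the parent alert for servers where the parent itself is down."""
    by_server: Dict[str, List[str]] = defaultdict(list)
    for server, component in alerts:
        by_server[server].append(component)

    result: List[AlertKey] = []
    for server, components in by_server.items():
        if parent in components:
            result.append((server, parent))  # one alert instead of many
        else:
            result.extend((server, c) for c in components)
    return result


def resolve_recipient(name: str, out_of_office: Dict[str, str]) -> str:
    """Follow out-of-office forwarding until someone is available."""
    seen = set()
    while name in out_of_office and name not in seen:
        seen.add(name)  # guards against forwarding loops
        name = out_of_office[name]
    return name


alerts = [("WEB01", "Site A"), ("WEB01", "IIS"), ("WEB01", "Site B"), ("WEB02", "Site C")]
print(collapse_alerts(alerts))
print(resolve_recipient("dana", {"dana": "eli", "eli": "noa"}))
```

In the sample data, the three WEB01 alerts collapse into a single IIS alert, while the unrelated WEB02 alert passes through untouched.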

One of the products I tested was SNS++ from Highnet Systems. In the beginning we had to use a command channel to send our notifications, but after several meetings with the Highnet team they started developing a connector that works with the universal connector that ships with SCOM R2.

Now, after successfully implementing SNS++ at several customers (very simple to do, I must add), most of my notification problems are no longer handled inside Operations Manager 2007. All alerts are forwarded to SNS++, and with the product's Smart Routing feature we can deliver any alert to any recipient based on message filters in SNS++.

For more information you can visit Highnet Systems.

Categories: OpsMgr

How to stop false heartbeat alerts for DMZ servers

12/04/2011

One of the strange things in OpsMgr is the relationship between the Health Service Watcher object and the Root Management Server (RMS). A common mistake is to think that when we point a server to a gateway server (GW) or a management server (MS), the GW or MS is responsible for alerting us about the availability of the monitored server.

This is not the case in the current version of OpsMgr (hopefully the next version will help us deal with it better). All Health Service Watcher objects are hosted on the RMS, and if the GW server is down we will get a lot of “Computer not reachable” and “Health Service Heartbeat Failure” alerts for servers that are up and running.

Let's start with a common scenario where a GW server is connected to an MS through a firewall (FW).

In this scenario, when the MS, GW or FW is down we will get a lot of false alarms in the console telling us that all agents behind the FW are down.
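The reason for the false alarms can be sketched in a few lines of Python. This is a simplified model and not how OpsMgr is implemented internally: the function name, the hop names and the sample agents are assumptions. The point is that the watcher on the RMS only knows whether a heartbeat arrived, so it cannot tell a dead agent from a broken hop.

```python
from typing import Dict, List


def heartbeat_alerts(agents: Dict[str, bool], path: Dict[str, bool]) -> List[str]:
    """Return the agents the RMS would flag as down.

    agents maps agent name -> whether the server is really up.
    path maps each hop between the agents and the RMS (GW, FW, MS) -> whether it is up.
    """
    path_ok = all(path.values())
    # A heartbeat reaches the RMS only if the agent is up AND every hop is up.
    return [name for name, up in agents.items() if not (up and path_ok)]


dmz_agents = {"DMZ-WEB01": True, "DMZ-WEB02": True, "DMZ-SQL01": False}

# Healthy path: only the server that is really down is reported.
print(heartbeat_alerts(dmz_agents, {"GW": True, "FW": True, "MS": True}))

# GW down: every agent behind it is reported, including the healthy ones.
print(heartbeat_alerts(dmz_agents, {"GW": False, "FW": True, "MS": True}))
```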

We have 2 options to work around this:

Option 1: 

Add another GW server (GW2) and set all agents to fail over to it if GW1 fails. (How to failover an agent\GW). Keep in mind that if the FW or the network devices that connect the GW to the MS fail, you will still get all the unwanted alerts.

Option 2: Create an override for the two monitors “Computer not reachable” and “Health Service Heartbeat Failure”, and create a rule on the GW server that catches an event when a monitored server is down.

1. Create a group that contains all Health Service Watcher (agent) objects in the DMZ. In my case it was easy: I just needed to exclude all agents in my internal domains.

2. Go to the Authoring pane, search for the two monitors “Computer not reachable” and “Health Service Heartbeat Failure”, and set an override for the group created in step 1.

3. Create an event rule that catches the following event and assign the rule only to the GW server.

hb_5

In the end you will have two overridden monitors and a new event rule.

I hope this helps you lower the number of false notification alerts.

Categories: OpsMgr

OpsMgr Command line notification problem and fixes

11/01/2010

When I use a notification command channel to run scripts that forward alert information to an external system, the command line includes several alert parameters.

The annoying thing is that when an alert is missing one parameter, the notification command is not executed and event 21409 is logged in the OpsMgr log.

I bumped into this problem when a customer asked me to create several alerts/monitors using the SCOM GUI. These rules/monitors check simple event log parameters.

When an alert was raised by these rules, I noticed that "ManagedEntityPath" was empty.

If the channel for this command notification uses "Alert Source" as a notification parameter, it includes two parameters: "$Data/Context/DataItem/ManagedEntityPath$ $Data/Context/DataItem/ManagedEntityDisplayName$".

The first parameter is empty and the notification fails.

“The process could not be started because some of the data items could not be resolved…”

I found 2 useful workarounds.

The first is to include only "$Data/Context/DataItem/ManagedEntityDisplayName$" on a dedicated channel and subscription for the specific rules and monitors.

The second (and better) approach is to set a default for the problematic parameter, as I found here:

https://connect.microsoft.com/OpsMgr/feedback/ViewFeedback.aspx?FeedbackID=497677

$Data[Default=' ']/Context/DataItem/ManagedEntityPath$$Data/Context/DataItem/ManagedEntityDisplayName$

You can set it to a space, as in the example above, or to any other string you like.
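The behaviour described in this post can be imitated with a short Python sketch. It is a toy resolver written for illustration, not OpsMgr's actual substitution code: the `resolve` function, the regular expression and the sample alert dictionary are my own assumptions. It only mimics the two observations above, that an empty parameter makes the whole command fail and that a `Default` value prevents the failure.

```python
import re
from typing import Dict

# Matches $Data/...$ and $Data[Default='...']/...$ placeholders.
PLACEHOLDER = re.compile(r"\$Data(?:\[Default='(?P<default>[^']*)'\])?/(?P<path>[^$]+)\$")


def resolve(command: str, alert: Dict[str, str]) -> str:
    """Replace every placeholder with the alert value, or its default if the value is empty."""
    def substitute(match: "re.Match[str]") -> str:
        path = match.group("path")
        value = alert.get(path, "")
        if value:
            return value
        default = match.group("default")
        if default is None:
            # Mimics the failure described above: the command is not run at all.
            raise ValueError(f"could not resolve {path}")
        return default

    return PLACEHOLDER.sub(substitute, command)


# Hypothetical alert raised by an event log rule: the path is empty.
alert = {
    "Context/DataItem/ManagedEntityPath": "",
    "Context/DataItem/ManagedEntityDisplayName": "Application Event Log",
}

plain = "$Data/Context/DataItem/ManagedEntityPath$ $Data/Context/DataItem/ManagedEntityDisplayName$"
with_default = "$Data[Default='n/a']/Context/DataItem/ManagedEntityPath$ $Data/Context/DataItem/ManagedEntityDisplayName$"

print(resolve(with_default, alert))
try:
    resolve(plain, alert)
except ValueError as err:
    print("notification failed:", err)
```

The sample uses 'n/a' instead of a space only so the substituted default is visible in the output.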

Categories: OpsMgr