Home > OpsMgr > StateChangeEvent table and grooming problem in OperationsMananger Database

StateChangeEvent table and grooming problem in OperationsMananger Database

I noticed this problem several times in different customer sites. the problem is that the OperationsManager DB is growing and the grooming process is skipping the state change events table.

There are several good blogs that explain how the groom process works like:

Steve Rachui ,Kevin Holman and Daniele Grandini, I used their knowledge to free some space used by operations manager database.

To verify the space used in the DB, just execute the standard report named “Disk usage by top table” in the SQL management studio console

clip_image002

The following query give us indication which monitors are the noisiest. keep in mind that some are in a result of config churn and we must tune them.

select distinct top 50 count(sce.StateId) as NumStateChanges, m.MonitorName, mt.typename AS TargetClass
from StateChangeEvent sce with (nolock)
join state s with (nolock) on sce.StateId = s.StateId
join monitor m with (nolock) on s.MonitorId = m.MonitorId
join managedtype mt with (nolock) on m.TargetManagedEntityType = mt.ManagedTypeId
where m.IsUnitMonitor = 1
group by m.MonitorName,mt.typename
order by NumStateChanges desc

clip_image004

Opssss…..This seems to be a lot of state change events!!!

Since we already take care and changed some of the discoveries to deal with config churn, we wanted to find out for how long we keep data in the database, to do so just run the following query, take in mind that the result is in days.

SELECT DATEDIFF(d, MIN(TimeAdded), GETDATE()) AS [Current] FROM statechangeevent

clip_image006

It’s time to loose some weight! (the query is in the bottom of the page)

here are the results:

clip_image008

Most noisiest monitors in the database

clip_image010

How many old change state data in Days

clip_image012

and as you can see we align the data with the grooming settings.

clip_image014

To delete the old data from State Change event DB run the following:

USE [OperationsManager]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
BEGIN

    SET NOCOUNT ON

    DECLARE @Err int
    DECLARE @Ret int
    DECLARE @DaysToKeep tinyint
    DECLARE @GroomingThresholdLocal datetime
    DECLARE @GroomingThresholdUTC datetime
    DECLARE @TimeGroomingRan datetime
    DECLARE @MaxTimeGroomed datetime
    DECLARE @RowCount int
    SET @TimeGroomingRan = getutcdate()

    SELECT @GroomingThresholdLocal = dbo.fn_GroomingThreshold(DaysToKeep, getdate())
    FROM dbo.PartitionAndGroomingSettings
    WHERE ObjectName = ‘StateChangeEvent’

    EXEC dbo.p_ConvertLocalTimeToUTC @GroomingThresholdLocal, @GroomingThresholdUTC OUT
    SET @Err = @@ERROR

    IF (@Err <> 0)
    BEGIN
        GOTO Error_Exit
    END

    SET @RowCount = 1  

    — This is to update the settings table
    — with the max groomed data
    SELECT @MaxTimeGroomed = MAX(TimeGenerated)
    FROM dbo.StateChangeEvent
    WHERE TimeGenerated < @GroomingThresholdUTC

    IF @MaxTimeGroomed IS NULL
        GOTO Success_Exit

    — Instead of the FK DELETE CASCADE handling the deletion of the rows from
    — the MJS table, do it explicitly. Performance is much better this way.
    DELETE MJS
    FROM dbo.MonitoringJobStatus MJS
    JOIN dbo.StateChangeEvent SCE
        ON SCE.StateChangeEventId = MJS.StateChangeEventId
    JOIN dbo.State S WITH(NOLOCK)
        ON SCE.[StateId] = S.[StateId]
    WHERE SCE.TimeGenerated < @GroomingThresholdUTC
    AND S.[HealthState] in (0,1,2,3)

    SELECT @Err = @@ERROR
    IF (@Err <> 0)
    BEGIN
        GOTO Error_Exit
    END

    WHILE (@RowCount > 0)
    BEGIN
        — Delete StateChangeEvents that are older than @GroomingThresholdUTC
        — We are doing this in chunks in separate transactions on
        — purpose: to avoid the transaction log to grow too large.
        DELETE TOP (10000) SCE
        FROM dbo.StateChangeEvent SCE
        JOIN dbo.State S WITH(NOLOCK)
            ON SCE.[StateId] = S.[StateId]
        WHERE TimeGenerated < @GroomingThresholdUTC
        AND S.[HealthState] in (0,1,2,3)

        SELECT @Err = @@ERROR, @RowCount = @@ROWCOUNT

        IF (@Err <> 0)
        BEGIN
            GOTO Error_Exit
        END
    END   

    UPDATE dbo.PartitionAndGroomingSettings
    SET GroomingRunTime = @TimeGroomingRan,
        DataGroomedMaxTime = @MaxTimeGroomed
    WHERE ObjectName = ‘StateChangeEvent’

    SELECT @Err = @@ERROR, @RowCount = @@ROWCOUNT

    IF (@Err <> 0)
    BEGIN
        GOTO Error_Exit
    END 
Success_Exit:
Error_Exit:   
END

Advertisements
Categories: OpsMgr Tags:
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: