After moving to our new monitoring infrastructure, we are switching to a 1-second polling interval for all client services that use our agent and to a 5-second interval for agentless clients. These intervals are 15 and 12 times shorter, respectively, than in our previous implementation and will make uptime monitoring more accurate.

We are making this change because we noticed that some client equipment would time out for less than 1 second, mostly due to load on the service, and the monitor would then record a much longer downtime (up to 59 seconds in the worst case).
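
To illustrate the over-counting described above, here is a minimal sketch in Python. It is not our monitoring code. It assumes a simple accounting model in which every failed poll marks the whole interval until the next poll as downtime. The function name, the 0.9-second timeout, and the 60-second previous interval (inferred from "12 times" the new 5-second interval) are assumptions made for the example.

```python
def recorded_downtime_ms(outage_start_ms, outage_len_ms, poll_interval_ms, window_ms):
    """Return the downtime (in ms) a poller would record over a window.

    Assumed model: polls happen at 0, interval, 2*interval, ... and each
    failed poll counts the full interval until the next poll as downtime.
    """
    outage_end_ms = outage_start_ms + outage_len_ms
    failed_polls = 0
    for poll_time_ms in range(0, window_ms, poll_interval_ms):
        # A poll fails only if it lands inside the outage.
        if outage_start_ms <= poll_time_ms < outage_end_ms:
            failed_polls += 1
    return failed_polls * poll_interval_ms


# Hypothetical scenario: a 0.9 s timeout that begins exactly on a poll.
OUTAGE_START_MS = 60_000
OUTAGE_LEN_MS = 900
WINDOW_MS = 120_000

for interval_ms in (60_000, 5_000, 1_000):
    recorded = recorded_downtime_ms(OUTAGE_START_MS, OUTAGE_LEN_MS, interval_ms, WINDOW_MS)
    print(f"interval {interval_ms} ms: recorded {recorded} ms, "
          f"over-count {recorded - OUTAGE_LEN_MS} ms")
```

Under these assumptions, the same 0.9-second timeout is recorded as 60,000 ms of downtime with a 60-second interval, 5,000 ms with a 5-second interval, and 1,000 ms with a 1-second interval.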

Please note that this change applies only to client servers and services; eSG still polls its own hardware and services every 250 ms to report uptime. Some clients who wanted to make the changes in their software stack are also able to use the faster polling time.

Finally, the RRD graphs will be updated within the next 24 hours to account for the faster poll time. The updated graphs will start showing up once a full 24-hour period has been graphed.

Regards,
eSG NOC

Tuesday, January 19, 2016




