Tag Archives: nagios

Monitoring thermal sensors on Apple hardware with Nagios

Basic Thermal Overview

The System Management Controller (SMC) in modern Apple hardware is responsible for a host of functions, one of which is controlling thermal systems within a machine. Having the ability to monitor thermal sensor values in a production Mac server can be very beneficial, providing insight into the ambient and component temperatures of the machine and providing notifications when thermal systems fail.

Added today to my Mac OS X Monitoring Tools project is check_osx_smc, a new plugin that can talk to the SMC in modern Mac hardware to read and report on temperature and fan speed values. It is capable of reporting on multiple sensors at once, checking against set warning and critical values, then returning performance data for further analysis and graphing.

The plugin, written in C/Obj-C, takes a list of SMC registers (keys which point to a particular value), value thresholds for warning and critical states, and a temperature scale (Celsius/Fahrenheit). In the example below, we are asking an SMC on a Late 2012 Mac mini Server for its ambient temperature in Celsius as well as its primary fan speed:

./check_osx_smc -s c -r TA0P,F0Ac -w 70,5200 -c 85,5800

As the resulting temperature is above the 70°C warning threshold (and the fan speed is above its 5200rpm warning threshold), the plugin returns a warning:

WARNING - TA0P is 71.3C, F0Ac at 5354rpm | TA0P=71.3158;70.0;85.0; F0Ac=5354.0;5200.0;5800.0;
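To make the threshold logic concrete, here is a minimal Python sketch that parses the performance data from the output line above and works out the overall state. This is not the plugin's own code (the plugin is written in C/Obj-C); the function names are hypothetical, and the "at or above the threshold" comparison is an assumption about how the plugin decides.

```python
# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2

def parse_perfdata(line):
    """Parse 'STATUS - text | label=value;warn;crit; ...' into a dict
    mapping each label to a (value, warn, crit) tuple of floats."""
    _, _, perf = line.partition("|")
    result = {}
    for item in perf.split():
        label, _, fields = item.partition("=")
        value, warn, crit = (float(f) for f in fields.split(";")[:3])
        result[label] = (value, warn, crit)
    return result

def worst_state(perf):
    """Return the most severe state across all sensors.
    Assumption: a value at or above a threshold trips that threshold."""
    state = OK
    for value, warn, crit in perf.values():
        if value >= crit:
            state = max(state, CRITICAL)
        elif value >= warn:
            state = max(state, WARNING)
    return state

line = ("WARNING - TA0P is 71.3C, F0Ac at 5354rpm | "
        "TA0P=71.3158;70.0;85.0; F0Ac=5354.0;5200.0;5800.0;")
perf = parse_perfdata(line)
state = worst_state(perf)
```

Both sensors here sit between their warning and critical thresholds, so the overall state works out to WARNING, matching the plugin output.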

A practical example: recently, when the air conditioning unit in a client’s server room failed, the rising ambient temperature in a stack of Mac mini & Mac Pro servers triggered a warning and notification. In this case, cooling was restored before systems started to overheat and go down. Even in a far less dense environment, monitoring the thermal statistics of a single server can help you catch tricky things like overheating components or a fan not maintaining its target speed.

The plugin, as well as its accompanying documentation, is now available in the OSX-Monitoring-Tools GitHub repository. I hope that it serves you well, and welcome any queries or suggestions anyone may have.

check_osx_smc on GitHub


Monitoring the new OS X Server Caching Service with Nagios

As a follow-up to last week’s post on Monitoring Mac OS X Server Software Update Server, here is my new script to monitor the Caching Service on OS X Mountain Lion Server (Server.app 2.2+).

The Caching Service is a deceptively magical new service which automatically caches Apple update and Mac App Store content with no need for client configuration. A very good writeup of what the service actually does can be found at Noel Alonso’s blog.

In classic Apple style, the new Caching Service looks pretty bare on the surface, with a simple toggle and a slider to change the size of your cache. Whilst advanced configuration of the service is definitely possible, from a cursory glance none of the options or statuses give you a really good idea of what is happening in the background.

This is where the following script works nicely. Checking that the service is active, registered, and accessible to clients is a good way to confirm that content can be cached, and the performance data that the script returns gives you some excellent ways to monitor and analyse cache size and efficiency. As you can see in the graph above, I am able to get a good visual indication over time of how much content my server is caching, as well as how much cached content it is providing to clients versus downloading directly from Apple. In the next few weeks, I will write some articles on the RRDtool commands I am using to produce these graphs.
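As a rough illustration of what "efficiency" can mean here, the Python sketch below computes the share of bytes delivered to clients that did not have to be downloaded from Apple. The function and parameter names are hypothetical and are not the counters the script or the Caching Service actually report; substitute whatever byte totals your performance data provides.

```python
def cache_efficiency(bytes_to_clients, bytes_from_origin):
    """Percentage of client-delivered bytes served from the local cache.

    bytes_to_clients  -- total bytes the server handed to clients (hypothetical input)
    bytes_from_origin -- total bytes the server fetched from Apple (hypothetical input)
    """
    if bytes_to_clients <= 0:
        # Nothing served yet, so there is no meaningful ratio.
        return 0.0
    # Clamp at zero in case the server has downloaded more than it has served.
    saved = max(bytes_to_clients - bytes_from_origin, 0)
    return 100.0 * saved / bytes_to_clients

# Example with made-up numbers: 1000 bytes served, 250 of them fetched from Apple.
efficiency = cache_efficiency(1000, 250)
```

Graphed over time, a figure like this shows at a glance how much bandwidth the cache is saving.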

It is really satisfying to be able to see nice chunks of bandwidth that no longer have to come in via the internet. Enjoy!

check_osx_caching.sh on GitHub


Monitoring Mac OS X Server Software Update Server with Nagios

Many organisations use an internal Apple Software Update server to save bandwidth and control distribution of Apple-supplied updates. With the introduction of the new Caching service in Mountain Lion Server, the uptake of these kinds of internal caching and update services is only going to rise.

The model for an internal Apple Software Update server, however, has no easy failover or timeout built in, and therefore assumes excellent uptime for the service. If your service breaks down, users will not have access to any software updates, and many Mac admins will speak of how finicky and hateful the Software Update service can be at times.

With that, here is my Nagios monitoring script for the Software Update service. It will ensure the service is both running and accessible on its service port, then return performance data on the number of mirrored and enabled packages, as well as the overall size of your mirrored update cache.
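The "accessible on its service port" part of such a check can be sketched in a few lines of Python. This is an illustration only, not how check_osx_swupdate.sh is implemented; the function names are hypothetical, and the host and port are left as parameters because the right values depend on your server's configuration.

```python
import socket

def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False

def check_port(host, port):
    """Return a (Nagios exit code, status message) pair for a TCP port check."""
    if port_is_open(host, port):
        return 0, f"OK - {host}:{port} is accepting connections"
    return 2, f"CRITICAL - {host}:{port} is not accepting connections"
```

A wrapper script would print the message and exit with the returned code, so that Nagios treats an unreachable port as a critical state.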

I have also completed and am using a similar script for monitoring Apple’s new Caching service in Mountain Lion Server, and will release it this week after collecting enough historical data to make a pretty graph.

For now, here is check_osx_swupdate.sh:

check_osx_swupdate.sh on GitHub