Should Your MSP Monitor Everything?

Marc-Andre Tanguay, head automation nerd, SolarWinds MSP

I’m frequently asked about the best practices for monitoring. Partners often want to know if they should monitor a few things or if they should monitor everything. From talking with them, I’ve found partners fall into a few distinct categories when it comes to how they treat monitoring and alerting. Partners either:

  • Monitor everything and alert on everything (which makes things “noisy”)
  • Monitor everything but ignore 90% of it and only alert on the important things
  • Monitor everything but tweak thresholds so only things they care about show as failed—and use the rest for data collection
  • Monitor what they want and remove everything they find unnecessary or noisy
  • Monitor as little as possible
  • Don’t alert on anything

What do you want to achieve with monitoring?

So what should you do? As usual, there is no one-size-fits-all approach. Instead, you should decide what’s important to you, what you want to react to, and what data you want to collect. And your RMM platform will inform those decisions. Each platform has different monitoring and alerting capabilities, as well as different pre-built and custom monitoring features you can use or add.

You should review your platform’s monitoring and alerting capabilities, as well as any additional custom monitoring. For example, if you use SolarWinds® RMM or N-central®, check out the Automation Cookbook to see what’s available.

After seeing what you can monitor, determine what’s important to you. Let’s look at SQL Server monitoring as an example. Partners often tell me they get alerts for SQL monitoring but they turn it off because they don’t know how to act on it. If you don’t use SQL much, getting an alert stating the Buffer Cache Hit Ratio is below 70% may not mean much to you.

If the information isn’t important to you, decide whether to keep it for reporting purposes in case a customer calls, have it show as failed, or alert on it. Typically, monitoring more isn’t a bad thing. However, if you don’t plan to act on it, you should probably disable alerting and/or tweak the thresholds so you aren’t alerted.
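To make those three choices concrete, here is a minimal Python sketch. It is not tied to any RMM product: the `Check` class, the `Disposition` names, and the `evaluate` function are all hypothetical, and the 70% threshold is simply the Buffer Cache Hit Ratio figure from the example above. The point it illustrates is that collecting a value, marking it as failed, and alerting on it can be three separate decisions.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    """Hypothetical labels for what you do with a monitored value."""
    REPORT_ONLY = "report_only"  # collect data only, never show as failed
    SHOW_FAILED = "show_failed"  # show as failed on the dashboard, no alert
    ALERT = "alert"              # show as failed and notify someone


@dataclass
class Check:
    name: str
    threshold: float             # a value below this counts as a breach
    disposition: Disposition


def evaluate(check: Check, value: float) -> dict:
    """Always keep the sample; decide separately whether it fails or alerts."""
    breached = value < check.threshold
    failed = breached and check.disposition is not Disposition.REPORT_ONLY
    alert = breached and check.disposition is Disposition.ALERT
    return {"check": check.name, "value": value, "failed": failed, "alert": alert}


# A team that doesn't act on SQL counters keeps the data but stays quiet.
sql_cache = Check("SQL Buffer Cache Hit Ratio", 70.0, Disposition.REPORT_ONLY)
print(evaluate(sql_cache, 65.0))
```

Switching the disposition to `Disposition.ALERT` later, once someone on the team knows how to act on the counter, changes the outcome without throwing away any of the history you collected in the meantime.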

How to decide what to monitor

To determine what you should monitor, look at the list of alerts in your RMM dashboard and figure out which ones are important to you. Then, look at the various services and service templates. See what is available and what you’re not using that you may want to use. Over the years, lots of MSPs have removed/disabled templates and services because they felt they were noisy or didn’t know what to do with them—so it’s good to revisit and see what you may be missing.
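If your dashboard lets you export its alert list, a quick tally can show where the noise actually comes from before you decide what to keep. The sketch below is only an illustration: the `alerts` list, its device and service names, and the `noisiest_services` function are made up, and a real export will have its own format.

```python
from collections import Counter

# Hypothetical export: one (device, service) pair per alert raised.
alerts = [
    ("SRV-01", "Disk Space"),
    ("SRV-01", "SQL Buffer Cache Hit Ratio"),
    ("SRV-02", "SQL Buffer Cache Hit Ratio"),
    ("SRV-02", "SQL Buffer Cache Hit Ratio"),
    ("WKS-07", "CPU"),
    ("WKS-07", "Disk Space"),
]


def noisiest_services(alerts, top=3):
    """Count alerts per service and return the most frequent ones."""
    counts = Counter(service for _device, service in alerts)
    return counts.most_common(top)


for service, count in noisiest_services(alerts):
    print(f"{service}: {count} alerts")
```

A service at the top of the list is a candidate for a threshold tweak or for data collection only, not necessarily for removal.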

So where does that leave you? Personally, I prefer to monitor more rather than less when you’ve already paid for the license to manage the device with your RMM. In that case, adding more monitoring isn’t a cost, it’s a feature.

The biggest mistake I see partners make is leaving the default thresholds in place or turning on alerting for everything. When they do that, monitoring becomes too noisy and they tend to switch everything off.

Most RMM and PSA integrations give you the flexibility to choose which monitoring will trigger alerts. This is where I recommend you alert only on what you care about but keep monitoring everything else, so you have the data on hand when a customer calls asking for help troubleshooting some obscure error that nobody knows how to fix.

In short, in my opinion, more is better—and if your RMM can do it, why not use it?

Automation/Monitoring spotlight

While we’re on the subject of monitoring, I thought I’d also share a popular script we created that lets you monitor a long list of AV products. It’s frequently updated, so make sure to come back to the article if you need the latest version. Check it out here.

Also, if you’re interested in more scripts like this, take a look at our Automation Cookbook. We created it to help our partners collaborate and share their scripts. It’s called a cookbook because it’s a collection of automation recipes that MSPs can take and use on their own. It currently contains over 380 curated and reviewed scripts and is constantly growing with the support of MSPs everywhere, as well as the Automation Team at SolarWinds.
