
Datadog Cloud Monitoring Gains Watchdog Machine Learning Service


Overall, we are producing more data than ever before. Organizations are able to monitor even the smallest part of their business to make sure everything is running smoothly. The digital age waits for no one, and downtime is unacceptable for all businesses.

However, it is nearly impossible for humans to process and interpret this massive volume of data. As a result, many monitoring tools let you set specific thresholds and receive notifications when behavior falls outside the expected range. But setting up monitoring systems has become increasingly complex, since data gathering now spans so many different systems and locations.

Seeking to address those challenges, Datadog has unveiled Watchdog, a machine learning-based monitoring capability that automatically identifies hidden issues and anomalies in dynamic, cloud-based applications. The service is now generally available to all customers on Datadog’s Enterprise APM (application performance monitoring) plan.

According to the company, Watchdog auto-detects performance problems in applications without any manual setup or configuration. Using Datadog's machine learning algorithms, it automatically surfaces issues such as latency spikes in microservices, elevated error rates on endpoints, or network issues in one of a cloud provider’s zones. In theory, those capabilities allow engineers to head off problems before end users feel them.

Watchdog combines several Datadog features, including:

  • Anomaly Detection - By analyzing a metric’s historical behavior, anomaly detection distinguishes between normal and abnormal metric trends.
  • Outlier Detection - Automatically identify any host (or group of hosts) that is behaving abnormally compared to its peers.
  • Forecasting - Forecasting algorithms use machine learning to continuously evaluate a metric’s evolution and predict its future values. With forecasts, you can visualize expected trends and specify how far in advance you want to get alerted about potential issues, Datadog says.
  • Composite Alerting - Combine two or more separate monitors using logical operators to further refine your alerts, yielding actionable insights without a flood of alerts that require no action, the company asserts.
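
The article does not describe the algorithms behind these features, so the following Python sketch is only an illustration of the four ideas in their simplest textbook form, not Datadog's implementation. It assumes a z-score test for anomaly detection, a median-absolute-deviation test for outlier detection, a straight-line least-squares fit for forecasting, and a logical AND for composite alerting. All function names, thresholds, and sample values are hypothetical.

```python
from statistics import mean, median, stdev


def is_anomaly(history, value, z_threshold=3.0):
    """Anomaly detection (assumed z-score test): compare a new value
    with the metric's own historical mean and standard deviation."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold


def find_outliers(latest_by_host, tolerance=3.0):
    """Outlier detection (assumed MAD test): compare each host with the
    median of its peers rather than with its own history."""
    values = list(latest_by_host.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return [h for h, v in latest_by_host.items() if v != med]
    return [h for h, v in latest_by_host.items()
            if abs(v - med) / mad > tolerance]


def forecast_linear(history, steps_ahead):
    """Forecasting (assumed straight-line fit): extend a least-squares
    trend line `steps_ahead` samples past the last observation.
    Requires at least two samples."""
    n = len(history)
    xs = range(n)
    x_mean = mean(xs)
    y_mean = mean(history)
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, history))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return intercept + slope * (n - 1 + steps_ahead)


def composite_alert(latency_alert, error_alert):
    """Composite alerting: fire only when both monitors fire."""
    return latency_alert and error_alert


if __name__ == "__main__":
    latency_ms = [100, 102, 98, 101, 99, 100, 103, 97]  # hypothetical samples
    errors_by_host = {"web-1": 9, "web-2": 11, "web-3": 10, "web-4": 95}
    disk_used_pct = [50, 52, 54, 56]

    latency_alert = is_anomaly(latency_ms, 180)
    error_alert = bool(find_outliers(errors_by_host))
    print("latency anomaly:", latency_alert)
    print("outlier hosts:", find_outliers(errors_by_host))
    print("disk forecast:", forecast_linear(disk_used_pct, steps_ahead=3))
    print("page on-call:", composite_alert(latency_alert, error_alert))
```

The composite step shows why combining monitors reduces noise: in this sketch, a latency spike alone or a single misbehaving host alone would not page anyone, while both together would.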

At launch, Watchdog evaluates key application metrics, such as latency and error rates, from Datadog APM. The company says it is continually adding algorithms to apply Watchdog to new problems and additional types of monitoring data, so that it can detect new kinds of situations anywhere in your environment.

Author: Datadog’s Brad Menezes

For more details on how Watchdog works, Brad Menezes of Datadog wrote an extensive blog post, including screenshots, that shows the service in action. Menezes notes that Datadog enables you to reliably alert the right people at the right time without the need to set up alerts ahead of time, a capability that can be very attractive to MSPs.

As Datadog continues to grow its monitoring portfolio, it hopes to attract more MSPs as resellers. The company continues to invest in its partner program and to release features like Watchdog that will appeal to service providers that need to monitor multiple networks across various environments. It does face significant competition, though, from companies like Cisco AppDynamics, Dynatrace, New Relic and CA Technologies, among others.